multica

mirror of https://github.com/multica-ai/multica.git synced 2026-07-05 21:39:54 +02:00

Author	SHA1	Message	Date
Naiyuan Qing	906f70a3e2	Add comment trigger preview suppression (#3792 ) * Add comment trigger preview suppression Co-authored-by: multica-agent <github@multica.ai> * Use TanStack Query for trigger preview Co-authored-by: multica-agent <github@multica.ai> * Test note comments skip create triggers Co-authored-by: multica-agent <github@multica.ai> * feat(issues): redesign comment trigger chips as avatar chips Single agent renders as avatar + presence dot + full sentence; several agents collapse to an overlapping stack + active count, mirroring the header working chip. Per-agent skip moves into a click-opened popover (hover layers stay read-only tooltips); suppression reads as brightness, not a ban glyph. Loading and preview errors render nothing. Also: share one tooltip body across chip and popover rows, invalidate cached previews after a comment lands (the enqueued task changes the dedup answer), move the preview query key into issueKeys, and drop the now-unconsumed status field from useCommentTriggerPreview. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(server): drop comment trigger wrappers kept only for tests enqueueMentionedAgentTasks and shouldEnqueueSquadLeaderOnComment had no production callers after the compute/enqueue split — the comment path goes through computeCommentAgentTriggers. Tests now exercise the compute functions directly via package-local helpers, so the legacy adapters cannot drift from the real path. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * docs(skills): sync mentioning/squads source maps with shared trigger computation The squads source map still pointed the comment-trigger contract at the pre-refactor call chain (comment.go:940 -> shouldEnqueueSquadLeaderOnComment), and the mentioning skill referenced the deleted wrapper. Re-anchor both to computeCommentAgentTriggers / computeAssignedSquadLeaderCommentTrigger / computeMentionedAgentCommentTriggers with current line numbers. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 16:27:07 +08:00
Bohan Jiang	998ebe97e4	fix(autopilot): fail create-issue runs on any terminal task failure (#3943 ) Generalize SyncRunFromLinkedIssueTask beyond Codex no-progress: any terminal create-issue task failure with no retry still in flight now fails the linked autopilot run, so it can no longer hang in issue_created (invisible to the failure-rate auto-pause monitor). - fail the linked run for any terminal task failure, gated by the existing HasActiveTaskForIssue wait-for-retry guard - remove the isNoProgressTaskFailure classifier (subsumed; drops duplicated pkg/agent marker literals) - drop the redundant GetIssue/origin lookup; GetAutopilotRunByIssue leads and short-circuits ordinary failures in one query - tests: keep no-progress regression, add agent_error (non-retryable) and retry-pending cases Follow-up to #3927. VEN-661 / VEN-662 / MUL-3164	2026-06-09 14:48:20 +08:00
stevenayl	ee6200de25	fix(autopilot): fail no-progress issue runs (#3927 ) Fail create-issue autopilot runs that hang in issue_created after a Codex no-progress / semantic-inactivity task failure, so they surface as failed and count toward the failure-rate auto-pause monitor. - route failed create-issue issue tasks (no direct autopilot_run_id) into linked run sync - fail linked runs only for Codex no-progress / semantic-inactivity failures - wait when an active retry task still exists for the issue - add classifier coverage + a DB-backed listener regression VEN-661 / VEN-662 / MUL-3164	2026-06-09 14:25:04 +08:00
Bohan Jiang	24b162cdbc	feat(daemon): surface the real task initiator to the agent runtime (MUL-2645) (#3899 ) * feat(daemon): surface the real task initiator to the agent runtime (MUL-2645) In a multi-person workspace the agent runtime only ever saw the runtime OWNER identity: the brief's `## Requesting User` is sourced from runtime.OwnerID and the task-scoped token is owner-bound, so every requester (whoever commented, @mentioned, or chatted) appeared to the agent as the owner. Agents that route by initiator for permission, privacy, or audit all misjudged. Resolve the real task initiator at claim time and surface it distinctly from the owner: - comment / mention trigger -> triggering comment's author (member or agent) - chat task -> chat session creator (sessions are creator-only) - on-assign / autopilot / quick-create -> no attributable initiator (omitted) Adds initiator_{type,id,name,email} to the claim response, the daemon Task, and TaskContextForEnv, rendered into the brief as a new `## Task Initiator` section. The section documents the privacy boundary: the agent's credentials stay owner-scoped, so this is an attested identity for the agent's own routing/privacy logic, not act-as. No DB migration — both paths are derivable from existing rows. Tests: brief rendering (member/agent/omit/sanitize) + email guard unit tests, and claim-handler tests for the comment and chat paths. Co-authored-by: multica-agent <github@multica.ai> * fix(chat): store real sender as task initiator, not chat_session creator (MUL-2645) Review fix (Niko, PR #3899). v1 resolved the chat task initiator from chat_session.creator_id at claim time. That is correct for web chat and Lark p2p (creator == sender), but WRONG for Lark group chats: the group session creator is deliberately the installer (stable identity across member churn), not the message sender. So in a Lark group, every member who triggered the agent showed up in the brief as the installer/owner — the exact bug this issue is about, still live at that entry point. Capture the real sender at enqueue time instead of deriving it from the session creator at claim time: - migration 117: agent_task_queue.initiator_user_id (FK user, ON DELETE SET NULL); NULL for non-chat and pre-migration rows. - EnqueueChatTask now takes an explicit initiatorUserID. Web chat passes the authenticated request user; the Lark dispatcher threads the inbound sender (binding.MulticaUserID) through scheduleRun -> flushChatRun. The debouncer keeps the latest scheduled flush per session, so in a multi- sender silence window the LATEST sender wins (documented + tested). - claim handler resolves the initiator from task.initiator_user_id and drops the creator_id fallback entirely. The Lark group session creator stays the installer (unchanged) — only the task initiator is corrected, keeping the two concepts cleanly separate. Tests: dispatcher group regression (initiator = sender, not installer), latest-sender-wins, p2p initiator assertion; the chat claim handler test now sets creator != initiator and asserts the stored sender wins. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-08 19:29:57 +08:00
HMYDK	4190de3d64	fix(skills): quote description values in built-in SKILL.md YAML frontmatter (#3852 ) Built-in SKILL.md description values contained unquoted ': ' sequences, which strict YAML parsers (e.g. Codex) reject — silently dropping the skill at load. - Quote all eight built-in skill descriptions. - ensureSkillFrontmatter() re-synthesizes frontmatter that has a name but fails YAML validation, so malformed imports are repaired instead of dropped. - Unify frontmatter delimiter parsing into a single frontmatterParts helper. - Add strict-YAML regression tests over the built-in skills, plus unit tests for the recovery branch and delimiter variants. Closes #3851.	2026-06-08 13:10:24 +08:00
Bohan Jiang	62925b97f1	chore(cli): remove the --from-template flag from agent create (#3805 ) * chore(cli): remove the --from-template flag from agent create The `--from-template` CLI flag was an untaught, immature surface (the built-in skill's source-map explicitly marked the template path "out of scope"). It also silently ignored sibling create flags (--custom-env, --mcp-config, etc.) by short-circuiting before body assembly. Remove the flag and its runAgentCreateFromTemplate handler from the CLI. Scope is CLI-only. The agent-template product feature stays intact: - registry server/internal/agenttmpl/ (embedded curated templates) - handler server/internal/handler/agent_template.go - routes GET /api/agent-templates, GET /api/agent-templates/{slug}, POST /api/agents/from-template - the onboarding "create from template" flow (packages/views/onboarding) The onboarding flow calls the API directly and does not depend on the CLI flag, so removing the flag does not affect it. Updates the multica-creating-agents source map accordingly. MUL-3070 Co-authored-by: multica-agent <github@multica.ai> * fix: correct source-map note on agent-template usage + guard --from-template Review of #3805 (MUL-3070) flagged a factual error in the source-map note: it claimed onboarding uses the agent-template backend. It does not. `packages/views/onboarding/steps/step-agent.tsx` builds four hardcoded local presets (i18n-resolved) and creates via plain `POST /api/agents` (`createAgent`), never `POST /api/agents/from-template`. The whole agent-template stack (registry, handler, routes, `packages/core` client + query wrappers) is orphaned — the removed CLI flag was its only non-test caller. Rewrite the note to say so. Also add a regression test asserting `agent create` exposes no `--from-template` flag, so it can't be silently re-added. MUL-3070 Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-05 14:21:11 +08:00
Bohan Jiang	18a5224fe8	feat(cli): add --mcp-config flags to agent create/update (#3799 ) Agents already support an mcp_config field (consumed by the daemon → provider at task time) and the agent-settings UI exposes an MCP tab, but the CLI had no way to set it. This adds the missing CLI surface, mirroring the existing custom-env pattern: - `agent create` and `agent update` gain --mcp-config / --mcp-config-stdin / --mcp-config-file. The stdin/file channels keep MCP server tokens out of shell history and 'ps'; the three channels are mutually exclusive. - The value is validated as a JSON object (or the literal `null` to clear, on update), matching the agent-settings MCP tab. Empty stdin/file input errors instead of silently clearing a secret-bearing field. - Unlike custom_env, mcp_config IS settable via `agent update` — it is persisted through the generic UpdateAgent endpoint (no dedicated audited endpoint), so both create and update expose the flags. Adds parser/resolver unit tests (incl. secret-leak sanitization) and updates the multica-creating-agents built-in skill + source map. MUL-3070 Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-05 13:39:38 +08:00
Naiyuan Qing	7fdec9e6e4	Teach default PR handoff in issue skill (#3753 ) Co-authored-by: multica-agent <github@multica.ai>	2026-06-04 15:27:00 +08:00
Bohan Jiang	8db619c1cd	fix(email): wire SMTP_EHLO_NAME through self-host config + docs [MUL-2984] (#3749 ) * fix(email): wire SMTP_EHLO_NAME through self-host config + docs Follow-up to #3679, which added SMTP_EHLO_NAME in code but never exposed it to operators. - docker-compose.selfhost.yml: pass SMTP_EHLO_NAME through to the backend container. The compose env block is an explicit allowlist, so without this the override set in .env was silently dropped and never reached the process — making the escape hatch unusable on the docker path. - Document the var alongside its SMTP_* siblings: .env.example, SELF_HOSTING_ADVANCED.md, environment-variables.mdx, auth-setup.mdx, and self-host-quickstart.mdx (the last two with a strict-relay example). - email.go: log when os.Hostname() fails instead of silently falling back to net/smtp's lazy "localhost" — the exact greeting strict relays reject. - Add TestNewEmailService_EHLOName covering the env override, trimming, and the hostname fallback. MUL-2984 Co-authored-by: multica-agent <github@multica.ai> * fix(email): gate EHLO resolution to SMTP mode + sync docs to zh/ja/ko Addresses review nits on this PR: - email.go: resolve smtpEHLOName only when SMTP_HOST is set, so the Resend / DEV-stdout paths never call os.Hostname() or emit its failure log. The EHLO name is only ever used on the SMTP send path. - docs: add SMTP_EHLO_NAME to the zh/ja/ko variants of environment-variables, self-host-quickstart, and auth-setup, in sync with the English docs updated earlier in this PR. Note: the ja/ko self-host-quickstart and auth-setup pages were already missing the port-465 implicit-TLS example (pre-existing i18n drift from an earlier SMTP_TLS change, unrelated to this PR); the new EHLO block is inserted at the correct logical anchor regardless. A full ja/ko re-sync is left as a separate follow-up. MUL-2984 Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-04 14:44:55 +08:00
Arnaud Dezandee	a9f0739b52	fix(email): set EHLO hostname for SMTP relay compatibility (#3679 )	2026-06-04 14:07:47 +08:00
Bohan Jiang	b92c3fbc93	chore(analytics): stop shipping operational events to PostHog (MUL-2967) (#3720 ) * chore(analytics): stop shipping operational events to PostHog (MUL-2967) Operational / execution-lifecycle telemetry dominated PostHog event volume and drove the bill: runtime_offline alone was ~54% of ~22.6M events/mo, and ~99% of events were billed at the higher identified-event rate. These signals already have Prometheus counters (Grafana), so the PostHog copies were redundant cost. - Add analytics.IsMetricsOnly; metrics.RecordEvent now skips the PostHog Capture for runtime_* and autopilot_run_* while still incrementing their Prometheus counter (their analytics.Event constructors are retained to feed the metric label set via IncForEvent). - Remove the agent_task_* PostHog path entirely: drop captureTaskEvent and the AgentTask* constructors/constants. Their Prometheus side is unchanged via the typed BusinessMetrics.RecordTask* methods. Also remove the now-dead taskDurationMS / willRetryTask helpers. - Update the pairing lint test (no agent_task allow-list, no naked-Capture exception), add a RecordEvent skip test + IsMetricsOnly test, and update docs/analytics.md (taxonomy, per-event banners, reconciliation). Product/funnel events (signup, onboarding, issue_created, issue_executed, chat_message_sent, agent_created, autopilot_created, etc.) are unchanged and still ship to PostHog. Co-authored-by: multica-agent <github@multica.ai> * docs(analytics): correct agent_task Prometheus metric contract (MUL-2967) Address PR review: the agent_task_* "Prometheus-only" banner claimed the old PostHog event properties (task_id, agent_id, duration_ms, error_type, will_retry, ...) were the metric label set. They are not — the real labels are only source/runtime_mode/provider/terminal_status/failure_reason. - Replace the agent_task_* sections with the actual metric names and labels (multica_agent_task_; see business.go / labels.go), and explain that completed/failed/cancelled are terminal_status values on multica_agent_task_terminal_total, with wall-clock in the _seconds histograms. - Tighten the runtime_/autopilot_run_ banners so id properties aren't mistaken for labels. - Drop the stale AgentTask allow-list reference from the pairing lint test header comment. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-04 00:48:17 +08:00
Bohan Jiang	8c98940b79	Lark Bot integration MVP: migration + service boundary (MUL-2671) (#3277 ) * feat(db): add Lark integration migration (MUL-2671) Introduces seven tables for the 飞书 Bot integration MVP — per-agent PersonalAgent installations, user/chat bindings, inbound dedup + non-content drop audit, outbound card mapping, and short-lived single-use member binding tokens. Schema notes: - chat_session schema unchanged; Lark routes through a separate binding table rather than adding a metadata JSONB column. - Outbound card mapping is task/message scoped so multiple runs on the same session can't stomp each other's cards. - lark_inbound_audit stores routing / identity / drop_reason ONLY, never message body — the audit channel for unbound users and group messages that don't address the Bot. - app_secret stores ciphertext (encryption helper lands in a follow-up commit on this branch); DB never sees plaintext. Co-authored-by: multica-agent <github@multica.ai> * feat(util): add secretbox AES-256-GCM helper for at-rest secrets First consumer is lark_installation.app_secret (MUL-2671 §4.4), but the helper is intentionally generic — future per-tenant secrets that must not appear in a DB dump can reuse it. Construction: AES-256-GCM with a per-message random nonce, providing authenticated encryption. Tampered ciphertext fails Open instead of silently decrypting to garbage. Master key loaded from a base64 env var via LoadKey; key rotation is not in scope yet. Co-authored-by: multica-agent <github@multica.ai> * refactor(issues): extract IssueService.Create as single create entry (MUL-2671) Establishes the service-layer boundary mandated by Elon's 二审 of MUL-2671 §4.8: issue creation no longer lives inside the HTTP handler. Both the HTTP POST /issues handler and the future Lark /issue command call into service.IssueService.Create, so duplicate guard, issue numbering, attachment linking, broadcast, analytics, and agent/squad enqueue stay aligned. Handler responsibilities shrink to parsing the HTTP request, doing actor resolution / validation (transport-specific), and converting service results into the IssueResponse + 201. The transaction-wrapped core, attachment link, event publish, analytics capture, and agent/squad enqueue all move into service.IssueService.Create. A BroadcastPayload callback on the service keeps the WS broadcast shape (the full IssueResponse) without forcing the service to depend on handler-layer response types. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations): add Lark package skeleton (MUL-2671) Establishes the architectural boundaries Elon's 二审 mandated as first-PR blockers without dragging in OAuth, WebSocket, or card-patching code (those land in follow-up PRs): - ChatSessionService interface — channel-aware chat-session entry point for Lark, deliberately separate from the HTTP SendChatMessage handler. The HTTP handler's single-creator guard (creator_id == request user_id) is correct for the browser client but rejects group chat_sessions by construction; Lark needs its own service. - AuditLogger interface — the only path for recording dropped events. Its signature deliberately omits message body, enforcing the drop-audit policy (MUL-2671 §4.7) at the type level: unbound users and non-addressed group messages can't accidentally end up in chat_session. - Typed IDs (OpenID, ChatID) prevent UUIDs from being conflated with Lark-side identifiers at compile time. - DropReason constants align dashboard/audit queries across callers. Co-authored-by: multica-agent <github@multica.ai> * refactor(issues): move parent/project workspace check into IssueService (MUL-2671) Parent existence and project workspace membership now live inside IssueService.Create, inside the same transaction as the duplicate guard and counter increment. The HTTP handler stops re-implementing the lookup; every future create entry (Lark /issue, MCP, API keys) inherits the same boundary without copy-pasting the SQL. Adds two error sentinels (ErrParentIssueNotFound, ErrProjectNotFound) so transports can translate to their own error shapes. Handler-level cross-workspace tests guard the boundary against future regressions. Co-authored-by: multica-agent <github@multica.ai> * fix(db): harden Lark migration safety底座 — TTL cap + workspace FK (MUL-2671) Two storage-layer hardenings that move the must-fix line off "the app layer enforces it" and onto the schema itself, so future write paths or hand-inserted rows cannot regress the invariants. 1) lark_binding_token TTL cap. The DB CHECK was 1 hour as defense-in-depth while the app constant was 15 minutes; the CHECK now matches the product cap (15 minutes). Application constant docstring updated to reflect that storage enforces the same bound. 2) lark_user_binding workspace membership. The table previously only FK'd to workspace / user / installation independently, so a binding could exist for a user no longer in the workspace, or claim a workspace different from its installation's. Two composite FKs close the gap structurally: * (installation_id, workspace_id) → lark_installation(id, workspace_id) — guarantees a binding's workspace_id always matches its installation's workspace_id. A new UNIQUE (id, workspace_id) on lark_installation is added as the FK target. * (workspace_id, multica_user_id) → member(workspace_id, user_id) with ON DELETE CASCADE — when a user is removed from the workspace, the binding cascades away in the same transaction. There is no longer a path where lark_user_binding outlives workspace membership. These two FKs are the schema-level proof for §4.3's "unbound or non-workspace members cannot leak content into chat_session" invariant. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): inbound services + /issue dispatcher (MUL-2671) Lands the inbound service layer for the Lark Bot MVP, sitting on top of the migration + service-boundary scaffold from the previous commits. What ships: - sqlc queries for all seven lark_* tables (idempotent dedup insert, CAS WS-lease, single-use binding-token consume, etc.) plus GetMostRecentUserChatMessage for the /issue fallback. - AuditLogger backed by lark_inbound_audit; signature deliberately body-free so callers cannot leak content into the drop log. - ChatSessionService: find-or-create chat_session via the binding table (winner-takes-all on the UNIQUE race), append-with-dedup, /issue parser, "previous user message" fallback for bare `/issue` invocation. - Dispatcher orchestrates the inbound pipeline in one place: installation routing → group-mention filter → identity check → ensure session → append+dedup → /issue → enqueue chat task. Group sessions use the installer as creator (stable workspace identity); p2p uses the sender. Agent-offline path falls through with OutcomeAgentOffline so the WS adapter can reply with the offline notice from §4.6. - BindingTokenService: random URL-safe token, SHA-256 stored hash, 15-min TTL pinned at the application AND the DB CHECK; Redeem returns the same opaque error for all rejection cases (no timing oracle on replay). - Unit tests for the parser (13 cases), dispatcher (8 cases via fake Queries/Chat/Audit/IssueCreator/Enqueuer), and binding-token hash/entropy. Real-DB integration tests for OAuth + token redeem land alongside the HTTP handlers in the next commit. Out of scope for this commit (next ones on the same feature branch): OAuth callback, HTTP routes, WebSocket hub, outbound card patcher, frontend. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): installation HTTP surface + secretbox-gated wiring (MUL-2671) Lands the HTTP boundary on top of the inbound services from the previous commit. What ships: - InstallationService.Upsert: the only path that writes lark_installation. Encrypts app_secret with the secretbox passed in at construction time; refuses to fall back to plaintext storage (returns an error from the constructor if no Box is supplied), so a misconfigured dev environment cannot accidentally land a row with cleartext credentials. Revoke flips status without DELETE so audit trail survives. - HTTP handlers under /api/workspaces/{id}/lark/: * GET /installations — member-visible (Integrations tab renders for non-admins). Soft 200 with empty list + configured:false when MULTICA_LARK_SECRET_KEY is unset, so the tab does not error on self-host that has not opted in. * POST /installations — admin-only; 503 when not configured. Re-validates agent_id ∈ workspace before accepting credentials so a cross-workspace agent UUID is rejected. * DELETE /installations/{id} — admin-only; workspace-scoped lookup so one workspace cannot revoke another's installation by UUID guess. - POST /api/lark/binding/redeem (user-scoped, no workspace context): the only path that mints a lark_user_binding row from user action. Redeemer identity comes from the session, not the token, so a stolen link cannot bind an open_id to an attacker's Multica user. The composite FK on lark_user_binding cascades the binding away if the user is not (or no longer) a workspace member, so a non-member who steals the link gets 403 at the DB layer. - Two new event-bus types in protocol.events: EventLarkInstallationCreated, EventLarkInstallationRevoked. - Router wiring: MULTICA_LARK_SECRET_KEY drives a conditional initialization of h.LarkInstallations + h.LarkBindingTokens. When unset, the integration disables itself with an INFO log and the rest of the server boots normally. - Handler tests cover all four not-configured short-circuits. Happy-path integration tests (real DB, full create→list→revoke cycle and token mint→redeem) ship alongside the WS hub PR. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): close binding-token rebind & typed task errors (MUL-2671) Two must-fixes from PR review on HEAD `87ad15e1`: 1. Binding-token redeem could be used to grab an already-bound Lark open_id. Two changes harden the path: - lark.sql `CreateLarkUserBinding` now gates ON CONFLICT DO UPDATE on `multica_user_id = EXCLUDED.multica_user_id`, so a cross-user rebind via a second valid token returns zero rows instead of silently switching ownership. - `BindingTokenService.RedeemAndBind` consumes the token and writes the binding row inside one transaction. A failed bind no longer burns the token; a successful bind never leaves a consumed-but- unused token. Distinct typed errors: ErrBindingTokenInvalid (410), ErrBindingAlreadyAssigned (409), ErrBindingNotWorkspaceMember (403). The handler maps each to its own status code. 2. Dispatcher collapsed every `EnqueueChatTask` error to `OutcomeAgentOffline`, hiding infra failure and misusing the "offline" label for cases (e.g. archived agent) where it doesn't fit. Now: - `service.EnqueueChatTask` returns `ErrChatTaskAgentNoRuntime` and `ErrChatTaskAgentArchived` as sentinel errors; DB / load / insert failures stay wrapped as ordinary errors. - Dispatcher uses `errors.Is` to map only the productizable cases (`OutcomeAgentOffline`, new `OutcomeAgentArchived`); any other error is returned to the WS adapter so it can retry or page instead of disguising the outage as an offline card. A daemon that's merely disconnected is still NOT an error — as long as `agent.runtime_id` is set the chat task enqueues and waits for the daemon to claim it on next online (returns `OutcomeIngested`). Co-authored-by: multica-agent <github@multica.ai> * ci: re-trigger workflow on lark MVP must-fix HEAD Co-authored-by: multica-agent <github@multica.ai> * ci: re-trigger workflow on lark MVP must-fix HEAD (retry) Co-authored-by: multica-agent <github@multica.ai> * test(integrations/lark): guard binding-token sentinel contract (MUL-2671) Two unit tests that document and protect the must-fix invariants without requiring a DB: 1. TestRedeemAndBindRequiresTxStarter — if a future refactor wires up BindingTokenService without a TxStarter, RedeemAndBind must fail fast with a clear error rather than nil-panic on Begin. The atomicity contract (consume + bind commit together) depends on that transaction existing. 2. TestBindingErrorSentinelsAreDistinct — the HTTP handler maps ErrBindingTokenInvalid → 410, ErrBindingAlreadyAssigned → 409, ErrBindingNotWorkspaceMember → 403. Accidentally aliasing them (e.g. var ErrBindingAlreadyAssigned = ErrBindingTokenInvalid) would silently regress the response codes without any other test catching it. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): WS hub orchestrator + outbound card patcher (MUL-2671) The hub owns one supervisor goroutine per active installation. Each supervisor acquires the WS lease via the existing CAS query, runs an EventConnector (interface — real Lark wire protocol lands in a follow-up behind it), renews the lease on a tighter cadence than the TTL, and backs off (with jitter) on connector failure. Lease loss tears the connector down cleanly; revocation is reaped on the next sweep. Per- process node id satisfies §4.4 multi-replica safety: at most one Hub globally holds the lease for any installation. The patcher subscribes to task / chat-done events on the existing events.Bus and keeps the per-task Lark interactive card in sync (thinking → streaming → final \| error). Card binding is per-task as required by §4.5; throttled patches via an in-memory last-patched map; final / error transitions bypass the throttle so the user always sees the terminal state. The Renderer is plug-replaceable so the product card template can evolve without touching transport. The APIClient interface centralizes the Lark Open Platform surface this package needs (send card, patch card, send binding prompt, exchange OAuth code). The default stubAPIClient returns ErrAPIClientNotConfigured for every transport call so a misconfigured deployment fails loudly instead of dropping cards silently. Real implementation lands in a follow-up; OAuth callback + frontend entries land in the next commits on this branch. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): OAuth install start / callback (MUL-2671) OAuthService builds a signed-state Lark authorization URL the frontend can render as a QR (or open directly), then on callback verifies the HMAC-protected state, exchanges the OAuth code for installation credentials via APIClient.ExchangeOAuthCode, and persists the row via InstallationService.Upsert (which keeps app_secret encryption inside a single chokepoint). State token format: workspaceID.agentID.initiatorID.expiresUnix.nonce.sig — HMAC-SHA256 over the first five fields with a deployment-level secret. TTL defaults to 10 minutes (covered by tests). Three failure modes (invalid state / expired state / missing code) map to typed errors so the HTTP handler can emit a single lark_error= query param the frontend uses to pick copy. Both endpoints degrade cleanly: the at-rest key gate (already in place) returns 503 from /install/start when the InstallationService is nil, and the OAuth gate (MULTICA_LARK_OAUTH_APP_ID / _SECRET / _REDIRECT_URI / _STATE_SECRET) returns configured:false from /install/start so the frontend can render "configure manually instead" without an error banner. /install/callback always finishes with a redirect to /settings?tab=lark carrying either lark_installed=1 or lark_error=<code>. Tests cover signed-URL shape, missing-config rejection, tampered state, expired state, propagated exchange error, and the no-config redirect path on the HTTP handler. Co-authored-by: multica-agent <github@multica.ai> * feat(views/lark): settings tab + agent bind button + /lark/bind redemption page (MUL-2671) Adds the user-facing Lark surface across the shared packages: - packages/core/types/lark.ts — wire shapes that mirror server/internal/ handler/lark.go. Optional fields default to undefined so older desktop builds keep parsing if the server adds new keys (CLAUDE.md → API Response Compatibility). - packages/core/lark/{queries,index}.ts — Tanstack Query options keyed by workspace id; realtime sync invalidates `installations(wsId)` on `lark_installation:` events. - packages/core/api/client.ts — listLarkInstallations, getLarkInstallURL, deleteLarkInstallation, redeemLarkBindingToken. - packages/views/settings/components/lark-tab.tsx — Settings → Lark panel. Listing is member-visible (matches backend); disconnect is admin-only. Empty state points users at the per-Agent bind entry, matching the (workspace_id, agent_id) UNIQUE: there is no "pick an agent" UI here because the bind URL is per-agent. - LarkAgentBindButton (same file) is the per-Agent CTA the Agent detail page imports. Opens the OAuth URL in a new tab; the callback bounces back to /settings?tab=lark with a query param the panel reads for inline confirmation copy. - packages/views/lark/bind-page.tsx — the Bot's "you need to bind" destination. Requires session before redeeming, distinguishes the 410/409/403 backend responses into distinct copy. - apps/web/app/lark/bind/page.tsx — Next.js route wrapping the shared bind page in a Suspense boundary (Next 15 useSearchParams rule). i18n: all user-facing strings land in en/zh-Hans, settings tab nav includes a Sparkles-iconed Lark entry, bind-page copy lives under common.lark_bind so it works pre-workspace-context too. typecheck + lint clean. Co-authored-by: multica-agent <github@multica.ai> chore(integrations/lark): wire outbound Patcher into server bootstrap (MUL-2671) Constructs the Patcher next to the existing Installation/BindingToken wiring in router.go and Register()s it on the event bus. With the stub APIClient any actual transport call surfaces ErrAPIClientNotConfigured; once the real Lark client lands, swap NewStubAPIClient for the real implementation here without touching the Patcher's subscription logic. doc.go updated to reflect everything the package now contains (Hub, Patcher, OAuthService, APIClient interface). The Hub itself is NOT booted here yet — it needs an EventConnector implementation for the Lark long-connection wire protocol, which lands in a follow-up; the orchestrator code and its unit tests are in place so that follow-up can focus on the WS protocol rather than lifecycle plumbing. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): address Elon 二审 5 must-fix items (MUL-2671) - Hub: renewer cancels run ctx on lease loss so the connector exits even if its wire I/O is blocked, keeping the §4.4 ownership invariant intact under lease theft. - Hub: EventEmitter returns (DispatchResult, error) so the real connector can post the matching Lark-side card (needs_binding, agent_offline, agent_archived) and react to infra failures instead of silently logging at the seam. - Dispatcher: top-level message_id dedup runs before group filter and identity check, so a reconnect storm cannot re-fire binding prompts or re-spam not_addressed_in_group audit rows; the in- AppendUserMessage dedup is removed since the table-level UNIQUE is the ultimate backstop. - OAuth: HandleCallback auto-binds the installer via the new InstallerBinder seam (BindingTokenService implements it), so the §2.1 "scan to bind, you're done" promise holds end-to-end. validateExchangeResult now requires installer open_id; new error reason codes wired through the callback redirect. - Frontend / handler: install_supported listing field + StartLark- Install short-circuit on stub APIClient hide install entry points (Settings tab + per-agent button) while no real Lark HTTP client is wired, so users do not land in an OAuth flow that fails at exchange. Includes tests for each fix (lease-loss cancel, emit error propagation, dedup ordering, OAuth installer-bind contract, stub- client install gate) and i18n strings for the new preview state. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): two-phase dedup so infra failures do not swallow messages (MUL-2671) The pre-fix top-level dedup wrote the lark_inbound_message_dedup row before EnsureChatSession / AppendUserMessage. An infra error in either step left the row in place and a WS-adapter retry was mis-classified as a duplicate, so the user's Lark message was permanently lost without ever landing in chat_session. Make dedup two-phase: - ClaimLarkInboundDedup acquires an in-flight claim (processed_at NULL). Stale claims older than 60 s are re-takeable so a process crash does not strand the message_id. - MarkLarkInboundDedupProcessed flips processed_at on durable success (audit row OR chat_message + session touch). - ReleaseLarkInboundDedup deletes the in-flight row on infra failure before any durable side effect, so the retry can re-claim immediately. Dispatcher.Handle now finalizes the claim exactly once based on whether the inner pipeline reached a durable outcome — chat_message commit being the transition point (errors past it Mark, errors before it Release). Regression tests cover the two failure variants Elon flagged plus the inverse invariants (durable-error Marks, drops Mark, in-flight replays drop, stale claims re-claim). Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): owner-fence dedup claim to close the double-write windows (MUL-2671) The two-phase Claim/Mark/Release fix from the previous commit closed the "infra error swallows a replay" gap but left two windows that could still write a chat_message twice for the same Lark message_id: 1. Stale-reclaim race. Worker A claims at t=0, runs slowly past the 60 s staleness TTL but is still alive. Worker B sees the row as stale and re-takes the claim. A reaches AppendUserMessage and commits a second chat_message. 2. Mark window. Worker A commits chat_message but the post-pipeline MarkLarkInboundDedupProcessed fails (DB hiccup) or the process crashes before it runs. 60 s later a retry treats the in-flight row as stale, re-claims it, and writes a second chat_message. Close both with owner fencing + same-tx Mark: - lark_inbound_message_dedup now carries a `claim_token` UUID; ClaimLarkInboundDedup mints a fresh one on insert and on stale re-take, so a reclaim ROTATES the token. - MarkLarkInboundDedupProcessed and ReleaseLarkInboundDedup are fenced on (message_id, claim_token, processed_at IS NULL) and return rowsAffected. Zero means our token is no longer live, and the caller treats it as a no-op (not an error). - AppendUserMessage invokes MarkLarkInboundDedupProcessed INSIDE its chat_message+session tx (qtx). If the token has been rotated by a concurrent reclaim, the Mark matches zero rows and the method returns ErrClaimLost; the deferred Rollback unwinds the chat_message insert, so the other holder is the sole writer. The durable write and the Mark therefore commit (or roll back) atomically — there is no "committed but not yet Marked" window for a crash or retry to exploit. Dispatcher.processClaimed now returns a tri-state dedupFinalize directive (none / mark / release): finalizeNone for the in-tx Mark path (and ErrClaimLost), finalizeMark for audit-drop branches and the defensive post-Append-success fallback, finalizeRelease for pre-durable infra errors. ErrClaimLost is translated into OutcomeDropped + DropReason- Duplicate at the Handle boundary, matching what the WS adapter expects for a "another worker is the writer" outcome. Regression tests: - TestDispatcher_StaleReclaimRaceDoesNotDoubleWrite injects worker B's reclaim via a beforeAppend hook so the claim_token rotates between Claim and AppendUserMessage. Asserts worker A's AppendUserMessage returns ErrClaimLost (no chat_message committed), the dispatcher surfaces a duplicate drop, the token rotated to a value distinct from A's original, and a follow-up replay still duplicate-drops. - TestDispatcher_InTxMarkPreventsPostCommitReclaim verifies the "Mark window" case is unreachable: a successful in-tx Mark produces exactly one Mark call (no post-finalize duplicate), the row is terminal, and a retry with dedupReclaim=true still duplicate-drops without re-rotating the token. - TestDispatcher_InTxMarkSucceedsAndSkipsPostFinalize pins the positive contract: DedupMarked=true must make applyFinalize a no-op (no extra Mark, no Release). fakeQueries gains a fakeDedupRow model carrying (processed, token, rotations) so the test seam matches production's UPDATE-with-WHERE semantics; fakeChat gains a beforeAppend hook to inject race timing. go test ./... and go vet ./... pass. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): real Lark HTTP APIClient for IM v1 send/patch (MUL-2671) Lands the production Lark Open Platform HTTP APIClient that replaces the stub for outbound transport. The patcher's "thinking → streaming → final \| error" card lifecycle and the dispatcher's binding-prompt card both now reach Lark for real once MULTICA_LARK_HTTP_ENABLED=true. Scope of this stage: - tenant_access_token retrieval via /open-apis/auth/v3/ tenant_access_token/internal, cached in-process per app_id with a 60s safety margin against Lark's `expire` value. Sub-2-minute expires are clamped to 120s so we never cache an entry that's already past its safe window. - SendInteractiveCard: POST /open-apis/im/v1/messages?receive_id_type=chat_id returning the Lark message_id the Patcher persists in lark_outbound_card_message for later patches. - PatchInteractiveCard: PATCH /open-apis/im/v1/messages/:id with the full re-rendered card body (Lark's update endpoint replaces, not deep-merges). - SendBindingPromptCard: open_id-targeted interactive card with a primary "去绑定" CTA pointing at the redemption URL. Template is co-located with the transport so the dispatcher never has to know about Lark's card schema. - Token-error invalidation: Lark codes 99991663 (expired) / 99991664 (invalid) drop the cached token so the next call refreshes from /tenant_access_token/internal instead of looping on a stale entry. Out of scope (deferred to follow-up stages): - ExchangeOAuthCode stays unimplemented behind ErrAPIClientNotConfigured. The PersonalAgent install handshake's response shape (returning per-installation app credentials in a single call) is not yet verified against the production endpoint, and a silent mis-fill of OAuthExchangeResult would corrupt lark_installation rows past validateExchangeResult. Operators continue to use the manual-paste InstallationService path until the OAuth stage lands. - Inbound WS EventConnector — Hub's ConnectorFactory still needs a real wire-protocol implementation. Wiring: - MULTICA_LARK_HTTP_ENABLED=true switches router.go from the stub to the real client. MULTICA_LARK_HTTP_BASE_URL overrides the default open.feishu.cn host (set to open.larksuite.com for the Lark international tenant, or to an httptest URL for integration tests). - The OAuth handler now also receives the real client (its ExchangeOAuthCode still surfaces ErrAPIClientNotConfigured, so callback behavior is unchanged until that stage lands). Tests (19 new cases against an httptest.Server fake): - happy path send/patch/binding-prompt round trips, asserting URL query params, body shape, Authorization header - token cache: 3 sends share one /tenant_access_token/internal hit - token refresh after clock-driven expiry - sub-margin expire clamping (10s expire → cached for >= safety margin of wall-clock) - Lark error code surfacing (230001 send, 230002 patch, 10003 auth) - token-expired (99991663) invalidates the cache; caller's retry re-fetches and succeeds - non-2xx HTTP status surfaces "http 500: …" - input validation: missing chat_id short-circuits BEFORE auth round-trip, missing card json / open_id / bind url all fail pre-flight without hitting Lark - ExchangeOAuthCode still returns ErrAPIClientNotConfigured - binding-prompt template carries the BindURL and the localized "去绑定" CTA in valid JSON go build ./..., go vet ./..., and go test ./internal/integrations/lark/... pass. Pre-existing handler/router integration tests that require a real Postgres connection are unaffected by this change. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): split outbound vs OAuth-install capability + card update_multi (MUL-2671) Address Elon's two must-fix items from the HEAD `a09993b1` review: 1. HTTP outbound and OAuth-install are now distinct APIClient capabilities. The new SupportsOAuthInstall() reports whether the install flow can succeed end-to-end (i.e. ExchangeOAuthCode is implemented); the real httpAPIClient still returns IsConfigured() = true (send / patch / binding prompt work) but SupportsOAuthInstall() = false until the PersonalAgent install-time response shape is pinned. Handler-side `install_supported` and StartLarkInstall now gate on SupportsOAuthInstall, so a half-wired client never reveals the scan-to-bind UI. larkOAuthErrorReason also maps ErrAPIClientNotConfigured to a dedicated `oauth_exchange_unimplemented` reason so a raw callback hit no longer masquerades as `internal_error`. 2. defaultRenderer now emits config.update_multi=true on every Kind. Lark refuses to apply PatchInteractiveCard to a card whose initial config doesn't declare it shared/updatable, so the absent flag would make every patch after the first send silently no-op on the wire while the local outbound status row still flipped to streaming/final. Tests cover both halves of each fix: - TestHTTPClient_SupportsOAuthInstall_FalseUntilExchangeLands + TestHTTPClient_StubReportsBothCapabilitiesFalse pin the new capability surface. - TestStartLarkInstall_TransportOnlyClientReportsNotConfigured + TestListLarkInstallations_TransportOnlyClientReportsInstallNotSupported pin the handler gate at exactly the half-wired state. - TestLarkOAuthErrorReason_APIClientNotConfigured pins the mapping for both the bare sentinel and the fmt.Errorf-wrapped form HandleCallback produces. - TestDefaultRendererConfigCarriesUpdateMulti covers every CardKind. - TestHTTPClient_(Send\|Patch)InteractiveCard_DefaultRendererBodyHasUpdateMulti verify the wire body Lark actually receives carries update_multi through both send and patch transport paths. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): real OAuth code exchange + agent-detail bind entry (MUL-2671) Stages the install side of the MVP critical path on top of the real HTTP outbound work: - httpAPIClient.ExchangeOAuthCode runs the production Lark v2 OAuth flow: POST /authen/v2/oauth/token to swap the authorization code for the installer's open_id, then GET /bot/v3/info under the parent app's tenant_access_token to fetch bot_open_id. Result feeds InstallationParams unchanged so OAuthService.HandleCallback's auto-bind step lights up automatically. - HTTPClientConfig gains OAuthAppID/OAuthAppSecret, read from the same MULTICA_LARK_OAUTH_APP_ID/_APP_SECRET env vars the OAuthConfig consumes. SupportsOAuthInstall now mirrors that pair so the install capability gate is honest: outbound transport without OAuth creds reports configured-but-not-install-supported, exactly like before. - Agent detail inspector wires the LarkAgentBindButton in a new Integrations section, viewer-hidden by canEdit. The button still self-hides when SupportsOAuthInstall is false, so a deployment without OAuth creds renders the section empty rather than CTA-broken. - Capability wording cleaned across handler / router / lark-tab to say "OAuth-install capability" instead of "real APIClient wired", and the misleading TransportOnly... test was renamed/refocused on the early-return branch it actually exercises (Elon non-blocking note). Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): identity-only OAuth + atomic bind (MUL-2671) Addresses Elon's round-4 must-fix items on PR #3277: 1. OAuth v2 token → user_info chain now matches Lark's official user-OAuth shape. `httpAPIClient.ExchangeOAuthCode` POSTs /open-apis/authen/v2/oauth/token (RFC 6749: top-level access_token, NO open_id), then GETs /open-apis/authen/v1/user_info with the user_access_token as Bearer to obtain the installer's open_id / union_id. The test fixture now reflects the real wire shape (separate user_info handler; no synthetic open_id in the token response). 2. `OAuthExchangeResult` is identity-only — drops the synthesized shared-parent AppID / AppSecret / BotOpenID return that broke the UNIQUE(app_id) constraint and the dispatcher's per-app_id routing. `OAuthService.HandleCallback` no longer Upserts an installation row: it looks up the lark_installation already provisioned via the manual-paste POST /lark/installations route and binds the installer onto it. Two new typed errors — ErrInstallationNotProvisioned and ErrInstallationRevoked — map to `installation_not_provisioned` / `installation_revoked` reasons at the HTTP boundary so the UI can guide the admin. The PersonalAgent install API (which would deliver per-installation bot credentials at scan time) remains a follow-up; until it lands the OAuth flow is identity-binding only and the agent-detail bind button stays hidden on deployments without OAuth env (capability gate unchanged). 3. The installation lookup + installer bind run inside a single DB transaction so a concurrent revoke / re-provision between the read and the binding insert cannot leak a half-applied state. `InstallerBinder.BindInstaller` is renamed to `BindInstallerTx` and accepts the OAuth-service-owned transaction's qtx; the binding_token redemption path is unchanged. 4. `validateExchangeResult` is simplified to require only the installer's open_id; the obsolete ErrExchangeMissingAppID / AppSecret / BotOpenID sentinels are removed (no caller can trip them now). The oauth_test suite is rewritten to use a stub failTxStarter so tests covering state-token verification and exchange-error propagation remain DB-free, while a new TestOAuthCallbackOpensTxAfterValidExchange pins the post-must-fix order (state ok + exchange ok ⇒ Begin runs before any lookup or bind, and a Begin failure aborts cleanly with no bind). Verified locally: - go build ./... / go vet ./... clean - go test ./internal/integrations/lark/... ✓ - go test ./internal/handler -run 'Lark\|Binding\|OAuth' ✓ - go test ./internal/util/secretbox/... ./internal/service/... ✓ Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): device-flow scan-to-install (MUL-2671) Replaces the manual paste-credentials install path + identity-only OAuth callback (rejected in product review: too many steps before a user sees value) with a true single-step scan-to-install built on Lark's RFC 8628 device-flow registration endpoint (POST accounts.feishu.cn/oauth/v1/app/registration) — the same protocol the official larksuite/oapi-sdk-go/scene/registration package and zarazhangrui/feishu-claude-code-bridge use. User journey: admin clicks "Bind to Lark" on the Agent detail page → QR dialog opens → admin scans in the Lark app on their phone → authorizes the new PersonalAgent → dialog auto-closes with the new installation visible. No app_id / app_secret to copy, no Lark developer console visit, no Multica-side OAuth env to configure. Backend (server/internal/integrations/lark): - registration.go — inline ~280-line RFC 8628 client. Begin posts archetype=PersonalAgent / auth_method=client_secret / request_user_info=open_id; Poll follows the upstream SDK's state machine including the tenant-brand mid-stream domain swap to accounts.larksuite.com when a Lark-international account authorizes. SDK is NOT vendored — one endpoint isn't worth dragging the full oapi-sdk-go + transitive deps. - registration_service.go — owns the in-process session store + background polling goroutine. On success calls APIClient.GetBotInfo (the new IM-side endpoint added below) and writes lark_installation + the installer's lark_user_binding inside one DB transaction so a half-applied install can never land. Stable error_reason codes (expired / access_denied / lark_protocol_error / bot_info_failed / installation_conflict / installer_bind_failed / internal_error) drive the UI copy without parsing prose. - client.go / http_client.go — drops ExchangeOAuthCode and SupportsOAuthInstall (no longer applicable: device-flow returns identity alongside credentials in one response); adds GetBotInfo which mints a tenant_access_token from the freshly-minted client_id / client_secret and calls /open-apis/bot/v3/info for the bot_open_id. install_supported now gates on IsConfigured() (real HTTP client wired) instead of a separate OAuth capability. - binding_token.go — absorbs InstallerBindParams / InstallerBinder (previously in oauth.go), retargets the doc-comment from the OAuth caller to the device-flow caller. - Deletes oauth.go + oauth_test.go entirely. Handler & router (server/internal/handler, server/cmd/server): - POST /api/workspaces/{id}/lark/install/begin — opens a new registration session, returns {session_id, qr_code_url, expires_in_seconds, poll_interval_seconds}. Admin-only. - GET /api/workspaces/{id}/lark/install/{sessionId}/status — polling endpoint, returns {status, installation_id?, error_reason?, error_message?}. Workspace-scoped lookup so a stolen session_id cannot be polled from another workspace. Admin-only. - Removes POST /lark/installations (paste form), GET /lark/install/start (OAuth-redirect entry), and GET /api/lark/install/callback (OAuth redirect target). - Removes MULTICA_LARK_OAUTH_APP_ID / _APP_SECRET / _REDIRECT_URI / _STATE_SECRET / _AUTHORIZE_URL / _SUCCESS_URL env vars. Self-host operators no longer need a parent Lark app at all. Frontend (packages/core, packages/views): - New types BeginLarkInstallResponse / LarkInstallStatusResponse + matching API methods (beginLarkInstall / getLarkInstallStatus); drops getLarkInstallURL. - LarkAgentBindButton opens LarkInstallDialog instead of a window.open() to Lark's authorize page. The dialog uses react-qr-code (catalog) to render the verification_uri_complete inline as SVG (no external CDN image), polls status at the server-supplied cadence, auto-closes on success, offers "scan again" on terminal failure. Per CLAUDE.md "Enum drift downgrades, not crashes", error_reason switch has a default fallback so an older desktop build on a newer server still renders the generic failure copy. - Adds the device-flow strings to en + zh-Hans settings.json; removes the obsolete OAuth-not-configured copy. Verified locally: - go build ./... / go vet ./... clean - go test ./internal/integrations/lark/... — all green (existing tests + 15 new registration / GetBotInfo tests) - go test ./internal/handler -run 'Lark\|Binding' — all green - pnpm typecheck — all 6 packages clean - pnpm lint — 0 errors (15 pre-existing warnings, none in changed files) - pnpm --filter @multica/views test — 859/859 pass Pre-existing failures in server/internal/middleware (column "profile_description" missing from local test DB) reproduce against the parent commit and are unrelated to this change. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): gate bind CTA to workspace admins, terminate QR polling on 4xx (MUL-2671) Two frontend must-fixes from the PR #3277 二审: 1. LarkAgentBindButton now self-hides for non-admin viewers in addition to the existing install_supported check. The agent-detail page mounts the button under `canEdit`, which canEditAgent lets agent owners through even when they are not workspace admins — but the backend gates POST /lark/install/begin and the status poll on owner/admin (router.go:478-487), so the previous behavior shipped a CTA that was guaranteed to 403. The new gate reads workspace role from the same member list the settings tab already uses. 2. The status polling loop now terminates on 404 (session gone — server restarted, multi-instance routing, or in-process GC swept it) and 403/401 (permission revoked mid-session). Previously every error path scheduled another setTimeout, which trapped the user on a stale QR forever. ApiError gives us the HTTP status verbatim; terminal responses set status=error with stable error_reason codes (session_lost, forbidden) that flow through the existing dialog switch + retry/close affordances. 5xx + network blips still retry. i18n: new install_error_session_lost / install_error_forbidden in en and zh-Hans, with default fallback preserved per the enum-drift rule. Coverage: 6 new vitest cases — admin/owner allow, member deny, unsupported-install deny, and the two terminal-error polling paths using fake timers to assert the loop stops scheduling. Also clears a handful of stale OAuth/manual-install doc comments flagged in the review (non-blocker cleanup): doc.go's §10 now points at RegistrationService, installation.go's input-shape doc loses the OAuth-callback half, and client.go's stubAPIClient comments no longer reference OAuth callbacks. Co-authored-by: multica-agent <github@multica.ai> * docs(integrations/lark): describe gate as device-flow install in agent-detail integrations comment (MUL-2671) The comment block above the agent-detail Integrations section still described the capability gate as 'server-side OAuth-install'. The OAuth path is gone — install is now device-flow per RFC 8628 — so the comment now reads 'server-side device-flow install capability gate'. Pure comment change; behavior is unchanged. Cleans up the nit Elon called out in PR #3277 二审 (MUL-2671). Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): wire inbound pipeline + WS Hub at boot (MUL-2671) Stage 3.a of MUL-2671. Hub class, Dispatcher, ChatSessionService and AuditLogger have all been implemented and tested in prior PRs but none of them was constructed at boot, so the in-process plumbing was never exercised end-to-end. This change wires them together behind the same `MULTICA_LARK_SECRET_KEY` gate that already gates InstallationService / RegistrationService, and starts the Hub under the existing `sweepCtx` so it winds down alongside the other long-running workers after HTTP drain. The real long-conn EventConnector is still pending; the factory hands every supervisor a shared NoopConnector that holds the lease and emits nothing. That lets staging exercise the lease / supervisor / shutdown lifecycle against real DB rows without committing to the Lark wire protocol implementation. Swapping in the real connector is a single line change in the same router block; the Dispatcher / ChatSessionService / Hub seams stay frozen. ## Why a noop placeholder, not a stub-or-skip The Hub's value is mostly its lifecycle: §4.4 ownership lease, LeaseRenewInterval / LeaseTTL, supervisor reap on revoke, clean release on shutdown. None of that runs unless the Hub is actually started. Holding off until the real connector lands means the next PR has to debut both pieces simultaneously; wiring the supervisor loop first lets the real connector PR be a focused, reviewable swap. ## Changes - `internal/integrations/lark/noop_connector.go` — `NoopConnector` implementing `EventConnector`: blocks on ctx until the Hub cancels (lease loss / shutdown / revoke), emits no events, logs on enter/exit so operators see exactly which installation the supervisor is holding the lease for. - `internal/integrations/lark/noop_connector_test.go` — verifies the connector blocks until ctx cancel, returns nil on clean exit, never invokes the emit callback, and the factory shares a single connector instance across installations. - `internal/handler/handler.go` — new `LarkHub lark.Hub` field on `Handler`. Nil when the Lark integration is disabled. - `cmd/server/router.go` — inside the existing Lark wiring block, construct `AuditLogger`, `ChatSessionService` (with `pgxpool.Pool` for the in-tx dedup Mark), `Dispatcher` (wiring `h.IssueService` and `h.TaskService` so `/issue`-created issues share counter / duplicate guard / project boundary / broadcast / analytics with the rest of the product), and the `Hub` with the `NoopConnectorFactory`. `NewRouterWithOptions` now returns `(chi.Router, handler.Handler)` so main.go can drive Hub lifecycle; `NewRouter` discards the handler. - `cmd/server/main.go` — start the Hub under `sweepCtx` after the other background workers, and `Wait` on it after HTTP drain + sweep cancel so the lease renewer can issue a final release before exit. Skipped entirely when `h.LarkHub == nil`. ## Test plan - [x] `go build ./...` clean - [x] `go vet ./...` clean - [x] `go test ./internal/integrations/lark/...` (new noop tests + existing hub / dispatcher / chat_service / registration / binding_token / outbound / issue_command suites) — all pass - [x] `go test ./internal/handler -run 'TestLark\|TestRedeemLarkBinding'` pass — handler-side Lark surfaces unchanged - [x] `go test ./internal/service/... ./internal/util/secretbox/...` pass - [x] `pnpm --filter @multica/views exec vitest run settings/components/lark-tab` pass (6/6) — frontend lark surfaces unchanged - [ ] Local broad `go test ./internal/handler/...` still blocked by the pre-existing test DB schema drift Elon flagged in the previous round (`column "metadata" does not exist`, unrelated to this change); CI is the authoritative check. - [ ] Manual end-to-end deferred until the real long-conn EventConnector lands (next stage). MUL-2671 Co-authored-by: multica-agent <github@multica.ai> fix(integrations/lark): bound Hub lease release + shutdown wait (MUL-2671) Lease release used context.Background(); a stalled DB pool could pin shutdown indefinitely. Add LeaseReleaseTimeout (5s default) and ShutdownTimeout (15s default) to HubConfig, route releaseLease through a bounded context, and expose WaitWithTimeout for main.go so a wedged supervisor degrades to LeaseTTL expiry on the next replica instead of blocking process exit. Also correct the LarkHub field comment in handler.go: the Hub is wired whenever the at-rest secret key is set, independent of whether the outbound HTTP APIClient is configured. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): real WS long-conn connector + ctx-cancel-breaks-read (MUL-2671) Replaces NoopConnectorFactory with a production EventConnector that opens Lark's event-subscription WebSocket. Gated behind MULTICA_LARK_WS_ENABLED so staging boots stay on the noop path until operators opt in, and falls back to noop with a warning when the WS flag is set without MULTICA_LARK_HTTP_ENABLED (the real connector needs the cached tenant_access_token). Why this connector exists separately from the Hub: gorilla/websocket ReadMessage blocks on the underlying TCP socket and does not observe context. The watchdog goroutine inside WSLongConnConnector.Run closes the conn the moment ctx fires, so lease loss / shutdown breaks the blocking read in bounded time — exactly the invariant Hub renewLeaseUntil's runCancel depends on for the "at most one active WS per installation across replicas" guarantee. Tests cover this explicitly (TestWSConnectorRunReturnsOnCtxCancelEvenWhenReadIsBlocked). The Lark wire surface is split into three swappable seams so the transport layer stays tested in isolation: - EndpointFetcher (POST /event-subscription/v1/connection_token) resolves a one-shot wss URL per Run. No caching — replaying a one-shot token would look like a Lark outage. - FrameDecoder turns one raw JSON envelope into an InboundMessage or a "control / heartbeat / drop" verdict. Decoder errors log + drop the frame; they do NOT tear down the connection. - CredentialsProvider wraps InstallationService.DecryptAppSecret so plaintext app_secret lives in memory only during a Run. Also fixes the handler.go LarkHub comment: it still said "joins on Wait during graceful shutdown" but main.go has used WaitWithTimeout (bounded wait) for several commits. Comment now matches. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): align WS to official binary Frame protocol + DispatchResult outbound replies (MUL-2671) Two must-fix items from Elon's review of PR #3277: 1. WS protocol layer rewritten to match the official Lark Go SDK (`larksuite/oapi-sdk-go/v3/ws`): - Bootstrap is `POST /callback/ws/endpoint` with AppID/AppSecret in the body (no tenant_access_token bearer). Response carries wss URL + ClientConfig (PingInterval / ReconnectInterval / ReconnectNonce / ReconnectCount). - `service_id` is parsed from the wss URL query and used as Frame.Service on every outbound frame. - Wire envelope is the binary protobuf `pbbp2.Frame` (hand-rolled via protowire to avoid pulling the whole SDK in, byte-identical field tags). JSON payloads are nested inside Frame.Payload. - Inbound data frames are ACKed with a `Response{code:200,...}` JSON payload that reuses the inbound headers; infra failures produce code=500 so Lark retries. - Ping is the app-layer binary `NewPingFrame(serviceID)` at the server-supplied cadence; WebSocket protocol PING is removed (Lark ignores it). Server-initiated pings get a pong reply. - ctx-cancel-breaks-read invariant preserved via the watchdog goroutine that closes the conn on ctx.Done; the read loop and ping goroutine serialize their writes through a single mutex. 2. `DispatchResult` outbound replies wired via a new `OutcomeReplier`: - `OutcomeNeedsBinding` mints a one-shot binding token and sends the binding prompt card to the sender's open_id. - `OutcomeAgentOffline` / `OutcomeAgentArchived` push a notice card into the chat with the agent name + Chinese copy matching §4.6. - `OutcomeIngested` stays owned by the Patcher; `OutcomeDropped` is silent. - The replier is best-effort: outbound failures are logged and swallowed so a Lark outage cannot stall the inbound pipeline. - Hub installs the noop replier by default; router wires the production `LarkOutcomeReplier` when APIClient.IsConfigured(). PersonalAgent long-conn risk surfaced (open per Feishu docs: `长连接模式仅支持企业自建应用`). The implementation works for any app archetype; the open question is whether `/callback/ws/endpoint` accepts PersonalAgent credentials in practice. Surfacing the Lark code+msg verbatim from the bootstrap response so an operator running the smoke test sees the exact failure rather than a generic timeout. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): byte-compat Frame marshal, chunk reassembly, ACK off reply critical path (MUL-2671) Three protocol blockers from Elon's review of `9540008a`: 1. Frame.Marshal is now byte-identical to oapi-sdk-go/v3/ws/pbbp2.Frame: - SeqID/LogID/Service/Method (proto2 req) emit unconditionally even at zero - PayloadEncoding/PayloadType/LogIDNew emit unconditionally per gogo generated MarshalToSizedBuffer (no zero-guard) - Payload uses the SDK's `!= nil` guard (nil omits, []byte{} emits 0-length) - ACK payload JSON matches SDK's NewResponseByCode + json.Marshal output ({"code":N,"headers":null,"data":null}) Golden tests pin exact byte sequences for ping/pong/ACK/full/zero frames; verified against the real SDK pbbp2.pb.go MarshalToSizedBuffer producing identical bytes. 2. Multi-frame events (sum>1) are reassembled via the new chunkAssembler: - 5s sliding TTL (matches SDK combine() cache TTL) - Lazy GC on admit (no separate sweeper goroutine) - Out-of-order seq + duplicate seq idempotent - Partial chunks are NOT ACKed (SDK behaviour: only the final chunk's ACK confirms the whole event so Lark can retry on partial loss) - Connector wires assembler per-Run; state dies with the session 3. OutcomeReplier detached from ACK critical path: - HubConfig.ReplyTimeout default 2.5s, strictly under Lark's 3s ACK deadline - handleEvent dispatches synchronously (fast DB path), then spawns the replier under a fresh background ctx with WithTimeout(ReplyTimeout) - Hub.replyWg tracks in-flight replies; Hub.Wait / WaitWithTimeout drain them so shutdown is bounded - Noop replier short-circuits inline (no goroutine cost when outbound APIClient isn't configured) Proof tests: - TestHubScheduleReplyReturnsImmediately: scheduleReply with a 10s slow replier returns in <50ms - TestHubReplyTimeoutCancelsHungReplier: hung replier ctx fires at ReplyTimeout - TestHubWaitDrainsInFlightReplies: Wait blocks until replies finish - TestHubACKNotBlockedByOutboundReply: end-to-end through the connector — data-frame ACK lands within 500ms even when the replier hangs 5s PersonalAgent real-env smoke remains Bohan's decision; this PR closes the technical blockers Elon flagged. Co-authored-by: multica-agent <github@multica.ai> * docs(service/issue): narrow position concurrency claim to create-create (MUL-2671) Elon's review of the merge resolution flagged that the comment on the new NextTopPosition call promised more than the code guarantees: concurrent manual reorder via UpdateIssue(position) does NOT take the workspace row lock that IncrementIssueCounter holds, so a create racing a reorder can still land on the same position. Rewrite the comment to only claim create-create serialization, which is the behaviour the lock actually delivers. No code change. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): keep device-flow polling on RFC 8628 HTTP 400 (MUL-2671) Lark's device-flow polling endpoint returns HTTP 400 with the JSON body `{"error":"authorization_pending"}` while the user hasn't scanned the QR yet — this is the RFC 8628 spec, and the upstream oapi-sdk-go implements the same handling. Our previous doForm treated ANY non-2xx as a terminal protocol error, so every install session was killed by the first poll (~5s after begin) and the install dialog appeared silently empty: the frontend received status=error + lark_protocol_error before the user could even read the description. Fix: doForm now decodes the JSON body first; if it parses, the caller (Begin / Poll) routes on the body's `error` field, where the existing switch correctly maps authorization_pending / slow_down to "keep polling" and access_denied / expired_token to terminal failure. Only unparseable bodies (5xx HTML proxy pages, gateway timeouts) still surface as a typed http_NNN RegistrationError. Three regression tests pin the new behaviour: - HTTP 400 + authorization_pending → res.Status="authorization_pending" - HTTP 400 + access_denied → res.Err.Code="access_denied" (terminal) - HTTP 502 + HTML body → http_502 RegistrationError Verified against the live local env: install/begin -> 200, status stays "pending" through the first poll cycle, no longer flips to "error" within seconds. Co-authored-by: multica-agent <github@multica.ai> * fix(views/lark): reset closedRef on every mount so StrictMode double-mount renders QR (MUL-2671) Empty QR dialog body in the dev env: Bohan opened the bind dialog and got an empty white area where the QR should have been — no QR, no "starting" placeholder, no error text. Backend was returning the QR URL correctly; the bug was on the frontend. Root cause: React 19 / Next.js dev StrictMode mounts every component twice (mount → cleanup → mount). The component instance is REUSED across the simulated remount, which means useRef objects are preserved. The dialog's `closedRef` lifecycle: 1. Mount #1: closedRef={current:false}, beginSession() kicked off (HTTP request still in flight) 2. Cleanup runs: closedRef.current=true 3. Mount #2: beginSession() kicked off again, BUT the ref still reads {current:true} from step 2 4. Both promises resolve. Both hit the post-await guard `if (closedRef.current) return;` and bail out before setSession(). 5. Result: session stays null forever. Every conditional in the dialog body (beginning/session-pending/success/error) is false → empty body. Fix: reset closedRef.current=false at the START of the effect, not just at component construction. The cleanup-then-mount pair now re-arms the guard so subsequent setSession calls actually land. Regression test wraps the dialog in <StrictMode> and asserts the QR appears within 2s with the correct value — fails closed if anyone removes the reset. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): drop EventTaskCompleted subscription so the chat reply doesn't get overwritten by "Done." (MUL-2671) Bohan reproduced on the live dev env: agent replies show only a card saying "Done." in Lark, even though Multica's own chat panel has the real "Hello! I'm cc…" reply. Tasks succeed end-to-end, but the user loses the reply on the Lark side. Root cause: TaskService.CompleteTask publishes two events for every chat task IN ORDER: 1. broadcastChatDone(...) → ChatDonePayload{Content: "Hello!..."} 2. broadcastTaskEvent(Completed) → map[string]any{task_id, agent_id,...} (no `content` key) The Patcher subscribed to BOTH and routed each to finalize(). The first patch correctly rendered the reply text, the second patched the same card with an empty payload — chatDoneContent() returned "" and the renderer fell back to "Done." (default empty-body copy). The second patch wins because Lark stores whatever was last applied. Fix: stop subscribing to EventTaskCompleted in the Patcher and remove the corresponding switch arm. EventChatDone is the canonical "agent finished replying" signal for the Lark card path; EventTaskCompleted is still emitted to the bus for other listeners (web UI, analytics, task usage) where the lack of content doesn't matter. Regression test TestPatcherIgnoresEventTaskCompletedForChatTasks emits ChatDone followed by TaskCompleted on a streaming card and asserts: exactly one patch, body contains the agent reply, body does NOT contain "Done.". If anyone re-adds the EventTaskCompleted subscription, this fails immediately. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): chat replies as plain text IM messages, not card chrome (MUL-2671) Bohan reported on the live dev env that even with the agent's reply shown correctly, every message is wrapped in an interactive card with the agent name as the header — it feels like a system notification, not a normal chat reply. He wants the reply to land as a regular Lark text bubble. Changes: - Add APIClient.SendTextMessage backed by Lark's /open-apis/im/v1/messages with msg_type=text. JSON-encodes the {"text": ...} envelope Lark requires so callers pass raw strings. - Patcher.Register no longer subscribes to EventTaskQueued / EventTaskRunning. There is no more thinking → running → final card lifecycle on the success path: it added card chrome without buying anything for free-form chat. - On EventChatDone, the new sendChatReply path posts the assistant message content as plain text. Empty content is silently dropped rather than rendered as "Done." (the prior fallback that confused Bohan). - Failure path keeps a one-shot error card on EventTaskFailed — the visual distinction from a normal reply is genuinely useful, and failures are rare enough that the chrome isn't noisy. - Throttle / lastPatched map / MinPatchInterval / shouldPatch / markPatched / loadCardOrSkip are all removed; nothing in the new flow patches. Tests: - TestPatcherSendsPlainTextOnChatDone pins the new contract: exactly one SendTextMessage call, no card sends or patches, content matches the ChatDonePayload. - TestPatcherDropsEmptyChatReply pins the "no more Done. fallback" decision — empty content drops, period. - TestPatcherFailEventSendsErrorCard pins the failure path still uses a card (one-shot, no patching). - TestPatcherIgnoresEventTaskCompletedForChatTasks rewritten for text path: ChatDone then TaskCompleted yields exactly one text send, no duplicate. - TestPatcherSkipsWhenNoChatSessionBinding and TestPatcherSwallowsInstallationLoadErrors rewritten to drive EventChatDone (the new entry point) instead of TaskQueued. - TestPatcherSendsThinkingCardOnTaskQueued deleted (no more thinking card). Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): pre-fill PersonalAgent bot name as "<agent> - Multica" (MUL-2823) (#3520) The device-flow install left the bot at Lark's auto-generated "{用户姓名}的智能助手". Lark's registration scene supports pre-filling the name via a `name` query param on the verification/QR URL (mirrors the upstream SDK's AppPreset.Name) — a user-editable default that rides on the QR URL, not the begin POST body (which has no name field). BeginInstall already loads the agent for its ownership check, so we keep it and thread `<agent.Name> - Multica` through Begin → decorateQRCodeURL. A blank name degrades to plain "Multica". There is no post-install rename API (bot/v3 is read-only; no bot/v3/update), so the install-time pre-fill is the only programmatic lever; the user can still edit the name on the creation form. Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): restore /issue confirmation + pin SendTextMessage wire (MUL-2671) Two recovered/added contracts off Trump's review of HEAD `fe381a07`: 1) /issue confirmation in Lark was a casualty of the plain-text refactor. The pre-refactor `RenderInput.IssueNumber` field was declared but never actually rendered into the card body, so even in the original card-based flow the user never saw a "Created [MUL-42]" confirmation. Now the OutcomeReplier handles OutcomeIngested + IssueID.Valid by sending a plain text message: Created MUL-42 — fix login bug https://multica.example/issues/MUL-42 Composed from a new DispatchResult.IssueIdentifier + IssueTitle, populated by the Dispatcher from workspace.IssuePrefix + issue.Number / issue.Title. Workspace lookup is best-effort: a Postgres blip on workspace gets a "#42" fallback rather than silently dropping the confirmation. The agent's own chat reply (if any) continues to land separately via the Patcher on EventChatDone — these are two semantically distinct messages and the user benefits from seeing both. 2) SendTextMessage is the wire layer Trump flagged for missing coverage. Three new wire tests pin: - happy path: POST /open-apis/im/v1/messages?receive_id_type=chat_id, msg_type=text, Bearer <tenant_access_token>, double-JSON content envelope - special-character round trip: newlines, double quotes, backslashes, tabs, Chinese + emoji, JSON-lookalike strings. The inner {"text": ...} is encoded once at JSON.Marshal time and once again when the outer body serializes; losing either pass corrupts the message and the bug is invisible without a contract pin. - Lark error path: non-zero `code` surfaces as a wrapped error with the code embedded. Tests: - TestDispatcher_IssueCreationFromCommand asserts IssueIdentifier ("MUL-42") and IssueTitle propagate through DispatchResult. - TestDispatcher_IssueIdentifierFallsBackToNumberOnWorkspaceLookupErr pins the "#7" degrade-graceful fallback. - TestLarkOutcomeReplierIssueCreatedSendsConfirmation pins the text body (identifier + title + deep link) and asserts no card send on this path. - TestLarkOutcomeReplierOutcomeIngestedSilentWithoutIssue pins the silent-on-plain-chat default so we don't accidentally start emitting a confirmation for every message. - TestHTTPClient_SendTextMessage_* covers the wire contract. Frontend locale parity (en + zh-Hans, 53 tests) is currently green on this HEAD; no changes needed. Co-authored-by: multica-agent <github@multica.ai> * fix(views/locales): add missing ko keys for Lark MVP (MUL-2671) Trump flagged on PR #3277 review that the ko bundle was missing the Lark-MVP-only keys that en + zh-Hans both carry. The parity test caught it cleanly after main was merged in (Korean PR landed on main between the prior review and this one): common.lark_bind.* (13 keys) settings.page.tabs.lark (1 key) settings.lark.* (45 keys) agents.inspector.section_integrations (1 key) Korean translations are professional/concise — "Lark" stays as the brand name (matches how en keeps "Lark" + "(飞书)" parenthetically; ko/users searching for the product expect "Lark"), and product copy follows the zh-Hans tone where Multica nouns ("에이전트", "워크스페이스") are romanized loan words consistent with the rest of the ko bundle. Slot ordering preserved against EN: - page.tabs.lark sits between github and integrations - inspector.section_integrations sits right after section_skills Verified: pnpm exec vitest run locales/parity → 105/105 pass. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): /issue origin_type CHECK + Hub restart on credentials rotation (MUL-2671) Two live-env bugs Bohan reproduced: 1) /issue command crashed the WS connector. Dispatcher writes origin_type='lark_chat' on issues born from `/issue`, but the issue_origin_type_check CHECK constraint was last extended in migration 060 for quick_create — it doesn't list lark_chat, so every Lark /issue tripped SQLSTATE 23514 and bubbled up as an infra error. The infra error tore down the WS connector, Lark retried the same message, the new connector tripped the same constraint and crashed again. Repro in the live env: three crashes from the same /issue event over ~40s, each leaving the user with no confirmation in Lark. Migration 111 extends the CHECK list: CHECK (origin_type IN ('autopilot', 'quick_create', 'lark_chat')) 2) Re-scanning an already-bound agent silenced the bot. The device flow re-registers with Lark, which mints a brand-new bot (fresh app_id + app_secret); RegistrationService.finishSuccess upserts into lark_installation by agent_id, so the row's credentials rotate in place. But the running supervisor held the OLD inst struct by value and kept a WS open against the OLD bot's app_id — so all events to the NEW bot went nowhere. Bohan's "claude code 现在不能在飞书里回复了" symptom maps exactly to this: log timeline: 16:29:57 cc connector connected with app_id=cli_aa9398dd... (OLD) 16:34:07 lark registration: install complete (rotation) → row.app_id is now cli_aa93f36f... (NEW) → old WS still subscribed to OLD app_id; new app_id receives nothing Fix: Hub.sweep now compares each installation row's credentials fingerprint (app_id + bot_open_id + sha256(app_secret_encrypted)) against the snapshot the running supervisor was started with. On diff, cancel the old supervisor and start a fresh one inline. A monotonic gen counter on the supervisor entry disambiguates the old goroutine's deferred cleanup from the new entry the rotation path already swapped in. Tests: - TestHubRestartsSupervisorOnCredentialsRotation pins the new path: starts hub on app_one, rotates the row to app_two, asserts the connector factory is called again with the fresh AppID. - TestHubDoesNotRestartSupervisorOnUnchangedRow pins the negative case so an unchanged row doesn't degenerate into a per-sweep busy-loop. - Existing hub tests (lease, supervise, shutdown, ACK timing, noop replier) all green. Verification: - go test ./internal/integrations/lark/... -race -count=1 ok - go build ./... clean - migration applied locally; \d+ issue confirms lark_chat in CHECK Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): per-supervisor lease token to fence rotation handoff (MUL-2671) Elon flagged a race in HEAD be8d4cef's rotation path: both the old and the new supervisors of the same Hub used the hub-wide nodeID as their WS lease token, so an old supervisor's post-cancel releaseLease(nodeID) would CAS-match the lease row the successor had just acquired with the SAME token and DELETE it. Symptom would be a silently empty lease row a few hundred ms after every device-flow re-scan — no replica owning the install, no events delivered, the "bot goes quiet" pattern Bohan hit the first time but now from the fencing side rather than the credentials side. Fix: leaseToken(nodeID, gen) composes "<nodeID>-g<gen>", where gen is the monotonic counter already attached to each supervisorEntry. The nodeID prefix keeps cross-replica observability (an operator inspecting lark_installation.ws_lease_token can still map back to a process) while the -g suffix makes the OLD supervisor's release target the OLD row state. Once the rotation path swaps in the new supervisor, the row's CurrentToken is the new -g(N+1) token, so the old -gN release's WHERE clause no-ops instead of clobbering. acquireLease / renewLeaseUntil / releaseLease now take an explicit token argument; supervise threads its leaseToken through. The plumbing isn't pretty, but having an explicit argument at every call site is the only way the rotation invariant survives subsequent refactors — without it, a future caller could quietly reintroduce "just use h.nodeID" and the race is back. Two regression tests: - TestHubRotationStaleReleaseDoesNotClearSuccessorLease drives the fake lease state machine directly: 1. old acquires(tokenA) 2. rotation lands; new acquires(tokenB) 3. old's stale release(tokenA) fires Asserts owner ends up still tokenB. Hub-wide-nodeID code would fail step 3 by clearing the entry. - TestHubRotationEndToEndKeepsSuccessorLeased runs the same scenario through the live supervise loop: starts hub, rotates the row, waits for sup2 to take over with a distinct token, sleeps past sup1's unwind, asserts the row is still held by a non-sup1 token. Catches the bug even when the goroutine timing is non-deterministic. Verification: go test ./internal/integrations/lark/... -race -count=1 ok go build ./... clean go vet ./... clean Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): route group @-mentions via union_id, not open_id (MUL-2671) In a Lark group with multiple Multica bots installed, the bot whose WS received the event sometimes failed to recognize that it was the @-target while the OTHER bot's supervisor falsely fired. Bohan's controlled three- message test (only @A, only @B, @both) hit this: @A and @B alone went unanswered, @both got picked up by A only. Root cause: the `mentions[].id.open_id` field Lark puts on the WS event is structurally INVERSE to `/bot/v3/info`'s `bot.open_id` across the two WSes. From A's WS perspective, the wire-form open_id for "A was @-ed" is NOT equal to A's API-side open_id, but IS equal to what B's WS sees on its side, and vice versa. The decoder's `mention.open_id == inst.BotOpenID` match therefore fires on the wrong bot in multi-bot groups. Only `union_id` (the Lark-tenant-scoped stable identifier) is consistent across both WSes. Changes: - migration 112 adds nullable `lark_installation.bot_union_id` - sqlc query exposes UpsertLarkInstallation/CreateLarkInstallation with bot_union_id, plus a focused SetLarkInstallationBotUnionID for the backfill path - httpAPIClient.GetBotInfo now follows /bot/v3/info with /contact/v3/ users/{open_id}?user_id_type=open_id and returns both identifiers on BotInfo. Soft-fails on contact-scope denial: install still succeeds with an empty UnionID, and the decoder falls back to the legacy open_id match for single-bot deployments. - RegistrationService.finishSuccess persists union_id alongside open_id during the device-flow finalize. - ws_frame_decoder.containsMention prefers union_id and only walks open_id when the installation row has not been backfilled yet. - BackfillBotUnionIDs runs once at server boot for installations created before migration 112; bounded per-row 10s timeout and a pure soft-fail policy so a slow Lark round-trip cannot block startup. - regression tests cover the three decoder paths: union_id match wins over open_id mismatch, union_id mismatch overrides open_id match, and open_id fallback when union_id is unknown. Co-authored-by: multica-agent <github@multica.ai> * chore: drop trailing blank lines at EOF on four files (MUL-2671) git diff --check origin/main..origin/pr-3277 flagged these as new blank lines at EOF; clearing so the diff stays clean for review. Co-authored-by: multica-agent <github@multica.ai> * fix(views/locales): add missing ja keys for Lark MVP + section_integrations (MUL-2671) CI frontend job tripped on the ja locale parity check: ja is missing the lark_bind block in common.json, the lark block + page.tabs.lark in settings.json, and inspector.section_integrations in agents.json. The ko fix earlier covered Korean; ja was added separately on main and the merge surfaced these gaps. Translations mirror the en source and follow the same voice as the existing ja bundle. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): rewrite @_user_N placeholders into clean body (MUL-2671) When Lark dispatches a group `im.message.receive_v1`, the message text contains opaque `@_user_1`, `@_user_2`, … placeholders and the real identity is in `mentions[]`. We were forwarding the raw text to the agent, so a Bohan-typed "@Bot ping test" arrived as "@_user_1 ping test" — neither human-readable nor useful as LLM context, and the agent was paying tokens to figure out which `@_user_N` was even itself. The new resolveMentions pass: * strips the bot's own mention entirely (the dispatcher already routes the event on AddressedToBot; re-emitting @<self> in front of every message adds zero signal and pollutes context), * substitutes other participants with `@<displayName>` so a follow-up "@Alice" reads naturally, * collapses horizontal whitespace introduced by the strip while preserving original newlines. Bot identity check uses the same union_id-preferred + open_id fallback as containsMention, so the rewrite stays consistent with the routing path. Tests cover the four shapes: bot self-mention, mixed bot + other-user mention, multi-line body with stripped mention, and a no-mention body that should be left untouched. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): union_id-first self mention strip + token-aware scan + local whitespace cleanup (MUL-2671) Three review blockers on the mention rewrite from PR review: 1. isBotMention now mirrors containsMention's union_id-first policy. When the installation row knows our union_id, we trust it exclusively (open_id is structurally inverted in multi-bot groups — matching on it would re-introduce the routing bug we fixed two commits ago). open_id fallback fires only when union_id is absent. New tests: @-ing both bots in one message correctly strips only self and renders the sibling as @<name>; open_id-matches-but-union_id-differs does NOT strip. 2. resolveMentions no longer collapses or trims whitespace globally. Indentation, tabs, code blocks, tables — all preserved verbatim. When the self mention is removed we eat exactly one adjacent horizontal space (the one after the placeholder, or, when the mention sits at end-of-input, a single space already emitted right before it). New test exercises a multi-line indented + tabbed body and asserts the whole shape survives. 3. Prefix-collision-safe replacement. A chat with 11+ participants exposes both `@_user_1` and `@_user_10`; naive ReplaceAll for `@_user_1` would mangle the substring of `@_user_10`. The resolver now does a single-pass token scan with the mention list sorted longest-key-first, so the longer placeholder always wins at any scan position. New test covers the @_user_1 / @_user_10 case explicitly. Also drops the temporary INFO-level diag logging the previous commit added — root cause was confirmed (union_id swap in the manual backfill; not a decoder bug). Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): scope inbound dedup per (installation_id, message_id) (MUL-2671) Root cause of the residual "@Cc gets dropped as not_addressed_in_group" even after the union_id swap landed: lark_inbound_message_dedup was keyed on `message_id` alone. In a Lark group chat where the workspace has multiple Multica bots installed, Lark delivers the SAME message_id to every bot's WS supervisor. Whichever WS claimed first then ran its own AddressedToBot check; the bot that was actually @-ed lost the dedup race, found the row already terminal (`processed_at IS NOT NULL`), and was dropped as `duplicate` BEFORE it could evaluate its own mention. Net: every @ silently disappeared if Lark happened to route the OTHER bot's WS first. The dedup gate's original purpose (idempotency against WS reconnect replay) is per-installation by definition, so the right key is composite (installation_id, message_id). Changes: - migration 113 drops + recreates lark_inbound_message_dedup with installation_id NOT NULL REFERENCES lark_installation(id) ON DELETE CASCADE and PRIMARY KEY (installation_id, message_id). The table is a 24h transient cache, so dropping existing rows is safe. - sqlc queries: ClaimLarkInboundDedup / MarkLarkInboundDedupProcessed / ReleaseLarkInboundDedup all now take installation_id. - AppendUserMessageParams carries InstallationID through to the in-tx Mark call so the chat_message+dedup atomicity stays intact. - Dispatcher passes inst.ID to claim + applyFinalize + AppendUserMessage. - Test fakes key dedup state on (installation_id, message_id) via a composite map key; all existing pre-seeded rows use a seedDedupKey helper bound to the default activeInstallation fixture so the prior staleness / token-rotation / in-tx mark tests still exercise the same regression they did before. - New regression TestDispatcher_DedupIsScopedPerInstallation pins the multi-bot invariant: a row pre-seeded for installation A does NOT block installation B's first delivery of the same message_id; B runs through its own group-filter / identity / ingest pipeline. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): render markdown chat replies via schema-2.0 card (MUL-2671) The agent's chat replies were going out as msg_type=text, so every `bold`, fenced code block, list, table, and link in the body showed up as literal markdown characters in Lark — the user saw raw asterisks, hashes, pipes instead of formatted text. Bohan reported this and pointed at zarazhangrui/lark-coding-agent-bridge as the shape to emulate. The bridge repo uses Lark interactive cards with the schema-2.0 envelope and a `tag: "markdown"` body element; Lark's client renders that to formatted text (GFM-ish: bold/italic, headings, lists, links, fenced code blocks, tables, blockquotes). They expose multiple reply modes (card / markdown-as-post / text) gated by user config; we go a step simpler — auto-detect markdown syntax in the agent's body and route accordingly: - containsMarkdown(): cheap substring + regex pass for fenced code blocks, headings, list markers, bold/italic, tables, links, blockquotes, horizontal rules, inline code. Biases toward false- positive — wrapping prose in a card still renders fine, but missing a real markdown block leaves raw characters visible. - APIClient gains SendMarkdownCard / SendMarkdownCardParams. Implementation marshals the schema-2.0 envelope verbatim: {schema:"2.0", body:{elements:[{tag:"markdown", content: md}]}}. Stub returns ErrAPIClientNotConfigured. - Patcher.sendChatReply now branches on containsMarkdown: markdown → SendMarkdownCard, plain prose → SendTextMessage. A one-liner "sure, on it" stays as a normal IM bubble (no card chrome); anything with markdown gets the rendered card. Tests: TestContainsMarkdown pins the heuristic across plain prose and ten markdown shapes; TestPatcherRoutesMarkdownReplyToCard and TestPatcherRoutesPlainReplyToText cover the router; new HTTP wire test TestHTTPClient_SendMarkdownCard_HappyPath contract-pins the card envelope (msg_type=interactive, schema 2.0, markdown tag, verbatim body). Full lark suite passes. Co-authored-by: multica-agent <github@multica.ai> * fix(service/issue): route analytics.IssueCreated through obsmetrics.RecordEvent (MUL-2671) CI's TestNoNakedAnalyticsCaptureInHandlersOrServices guard caught the post-merge analytics call in IssueService.captureCreatedAnalytics that still used s.Analytics.Capture(...) directly. Main added that lint to prevent the Prometheus and PostHog sides from drifting — any new analytics.* event must go through obsmetrics.RecordEvent so the business-metrics collector and the PostHog client fire from the same call site. Fix mirrors how TaskService handles it: IssueService gains a Metrics obsmetrics.BusinessMetrics field (router wires it via h.IssueService.Metrics = opts.BusinessMetrics next to the existing TaskService line), and the in-service Capture call becomes obsmetrics.RecordEvent(s.Analytics, s.Metrics, ...). nil-safe by construction — RecordEvent treats a nil Metrics as PostHog-only. Co-authored-by: multica-agent <github@multica.ai> feat(views/lark): swap Bind CTA for Connected+Manage link when agent already has an installation (MUL-2671) Bohan reported the agent-detail Bind button keeps inviting the user to re-scan the QR even when the agent already has an active Lark PersonalAgent connected — and re-scanning silently upserts the installation row, leaving the previously-created Lark bot dangling as a zombie. Frustrating UX and an actual product footgun. Anti-zombie guard at the only entry point: LarkAgentBindButton now checks the cached installations listing for an active row pinned to this agent_id. When one exists, the install CTA is gone — replaced by a small Connected pill + an "Manage in Lark" link that opens the Bot's app page in Lark's developer console (open.feishu.cn/app/<app_id>) in a new tab. That's where scopes, display name, and additional permission requests actually live; re-scanning never was the right answer for managing an existing bot. Scoping is per-agent: an active installation on a DIFFERENT agent in the same workspace doesn't affect this agent's button, and a revoked installation falls back to the bind CTA so the user can re-create. Tests cover all four states (own-active / own-revoked / other-agent-active / no-installation) and pin the Manage link's href + target=_blank + noopener. i18n: three new keys in settings.json (en / zh-Hans / ja / ko): agent_bot_connected_label, agent_bot_manage_link, agent_bot_manage_tooltip. Locale parity test still 157/157. The dev console host is hardcoded to open.feishu.cn — operators on the Lark international tenant currently get the wrong host; future-proof fix wants the backend to surface a per-installation dev_console_url on the listings response, called out in a code comment. Co-authored-by: multica-agent <github@multica.ai> * feat(views/settings): collapse Lark into Integrations + render agent identity (MUL-2671) Lark was its own top-level workspace settings tab while Integrations sat empty next to it. As more integrations land, the sidebar would balloon with one tab per provider. Move the Lark surface into Integrations as the first hosted integration; the old ?tab=lark URL redirects through LEGACY_WORKSPACE_TAB_REDIRECTS so bookmarks still resolve. The Connected bots list was leaking the raw Lark app_id (cli_…) as the row title with bot_open_id (ou_…) underneath — meaningless to product users. Since the binding is 1:1 with a Multica Agent, join on agent_id and render the agent's avatar + name via the workspace-standard ActorAvatar + useActorName.getAgentName. Deleted agents fall back to "Unknown Agent" so the row is still actionable for cleanup. Tests: stub useActorName + ActorAvatar in lark-tab.test.tsx and add LarkTab connected-bot tests covering the agent identity render and the deleted-agent fallback. Drop the now-dead integrations.* + page.tabs.lark + lark.bot_open_id_label keys across all four locales — parity still 157/157, views suite 1141/1141. Co-authored-by: multica-agent <github@multica.ai> * feat(views/settings): wrap Lark in a named section inside Integrations (MUL-2671) Integrations is meant to host multiple providers (Slack, Linear etc. as they land), so the Lark content should sit under a Lark heading rather than fill the tab directly — otherwise the first additional integration would feel like it broke the IA. Add a "Lark" / "飞书" section heading above LarkTab using the same h2 chrome the other settings tabs use, and pin lark.section_title across all four locales (parity 169/169). Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: J <j@multica.ai>	2026-06-03 19:12:14 +08:00
Naiyuan Qing	1544e3b68a	feat(skills): built-in agent skills (WIP) MUL-2759 (#3456 ) * feat(skills): introduce built-in agent skills (WIP) Inject platform-authored, version-bundled skills into every agent on top of its workspace-bound skills, so agents learn how to operate Multica correctly without users needing to know the internals or agents needing to read source. Mechanism: skills are embedded into the server binary and appended to the agent payload at task-claim time (handler/daemon.go), reusing the existing SkillData wire + daemon-side writeSkillFiles. The daemon needs no changes, and because it travels over an existing wire field, older daemons pick the skills up the moment the server ships. First skill: multica-mentioning — how to build a working @mention (look up the UUID, match type to id source, know what each mention type triggers). WIP: injection mechanism + first skill only; more skills to follow in dependency order (skill -> agent -> squad). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(skills): make multica-mentioning the standard template + add eval Add the contract-skill frontmatter the other built-in skills will copy: user-invocable:false (it triggers from context, not as a slash command) and allowed-tools fencing it to the multica CLI it teaches. These keys survive to agent machines untouched (ensureSkillFrontmatter only ever adds a missing name). Add a Go eval in builtin_skills_test.go (a _test.go so it never ships to agent machines via the skill-files walk): - Enforces the template invariants on every built-in skill, present and future: multica- prefix, name+description present, description within 1024 chars, body within the 500-line L2 budget, no eval file leaking into the shipped payload. - Couples the mentioning skill's documented contract to the real util.ParseMentions: its Incorrect examples must parse to nothing (a name where a UUID belongs fails silently) and its Correct example must fire. A drift in the mention regex now breaks CI instead of silently turning the skill into a lie agents act on. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(skills): add working-on-issues built-in skill Co-authored-by: multica-agent <github@multica.ai> * feat(skills): verify linked PRs in issue workflow skill Co-authored-by: multica-agent <github@multica.ai> * feat(skills): add skill import and discovery built-ins Co-authored-by: multica-agent <github@multica.ai> * feat(skills): add skill authoring built-in Co-authored-by: multica-agent <github@multica.ai> * docs(skills): align builtin skill workflows Co-authored-by: multica-agent <github@multica.ai> * docs(skills): use structured skill search Co-authored-by: multica-agent <github@multica.ai> * fix(skills): make built-in skill bundle launch-ready Co-authored-by: multica-agent <github@multica.ai> * fix(skills): align built-ins with additive skill binding Co-authored-by: multica-agent <github@multica.ai> * feat(skills): add creating agents built-in skill Co-authored-by: multica-agent <github@multica.ai> * Add built-in squads skill Co-authored-by: multica-agent <github@multica.ai> * refactor(skills): rewrite built-in skills as source-traced contracts Rewrite the built-in agent skills to the inbuilt-skill-authoring standard: state source-traced product facts with the source-code link logic as the core, not prescriptive how-to coaching. - creating-agents: drop the Decision-flow / Do-don't-consequences methodology; replace with field/behavior contracts (validation, persisted shape, daemon claim-time consumption, env gating, skill binding). - skill-discovery: stop teaching repo/github_stars as selection signals — searchClawHubSkills never populates them (always null); rank by install_count + source/url + description. Add file:line citations. - mentioning: drop the unbacked "member mention sends a notification" claim (no such path in the comment handler); state that only agent/squad mentions enqueue work. Tighten the parser-failure wording. - working-on-issues: refresh citations drifted by the main merge; describe the PR response `state` enum accurately; trim status coaching. - skill-importing: correct response type to SkillWithFilesResponse; document the reserved SKILL.md supporting-file rule; add line-accurate citations. - squads: correct the "leader cannot be archived" overstatement (not rejected at create/update; fails closed later at routing/dispatch); refresh source-map attributions and test list. Each skill now ships references/<skill>-source-map.md as its evidence layer (line-accurate citations live there, not pinned in the test, so a future main merge cannot rot them into stale lies). builtin_skills_test.go: replace coaching/line-number pins with drift-resistant contract anchors, forbid the coaching phrasing, and require every skill to ship its source-map. The ParseMentions behavior coupling is preserved. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * docs(skills): close field-role and citation gaps found in review Independent review of the rewritten built-in skills surfaced two real gaps and some citation drift; this fixes them. - creating-agents: add the three missing field rows (visibility, max_concurrent_tasks, mcp_config) to the field-contract table — mcp_config is runtime-consumed (TaskAgentData, daemon.go), visibility is access-control (default private), max_concurrent_tasks is a scheduler cap (default 6). Mark custom_args/runtime_config JSON validation as CLI-side (the server marshals as-is). Correct the CLI body-builder note (description/instructions use a non-empty check, the rest use Changed). Source-map: fix the env query name (UpdateAgentCustomEnv), the conformance test name, and add the new field defaults + the McpConfig runtime-payload line. - mentioning: the @squad mention private gate is canAccessPrivateAgent, not canEnqueueSquadLeader (that wrapper is the assignment/child-done path). - working-on-issues: cite notifyParentOfChildDone at its func def (:51), not the doc comment (:15). - skill-importing: config.origin is set only when the source supplied an origin — note it may be absent; cite createSkillWithFiles at its definition (skill_create.go:72), not the call site. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * Add built-in skills for autopilots runtimes and resources Co-authored-by: multica-agent <github@multica.ai> * feat(runtime): list skill descriptions in the brief Skills index The brief's `## Skills` section emitted bare skill names only, discarding the one-line description that SkillContextForEnv already carries. For Claude-family providers the frontmatter description is loaded natively; for providers without native skill discovery (hermes/default) the brief's list is the only signal they ever see, so a bare name gave them nothing to decide when to load a skill. Emit `name — description` when a description is present, falling back to the bare name when it is empty. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * refactor(skills): drop CLI-only rule from working-on-issues The "Platform data goes through the CLI" section duplicated the runtime brief's `## Important: Always Use the multica CLI` section verbatim (and the attachment-via-CLI note duplicated the brief's `## Attachments`). The CLI-only rule is universal and must be known before any skill loads, so the brief is its single source of truth; the skill copy was pure redundancy and a drift risk. Remove it and the matching intro clause. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * refactor(skills): remove discovery guidance from built-ins * docs(skills): remove stale skill-necessity records The per-skill necessity records had drifted to 3 of 8 shipped skills plus a record for `multica-skill-authoring`, which is not a shipped built-in skill. Per-skill "why it exists / when to use it" already lives co-located with each skill (frontmatter `description` + `references/<skill>-source-map.md`) where it cannot drift from the skill, and the doc's methodology duplicated the workspace's inbuilt-skill-authoring protocol. Remove the file rather than keep a parallel listing that every new skill has to remember to update. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(runtime): add source-authority escape hatch to the brief The brief already tells agents to run `--help` for command discovery, but nothing stated the trust precedence when a skill, the brief, or a doc seems to contradict actual behavior. Add one line to the Available Commands escape-hatch note: trust the live CLI (`--help`/`--output json`) and the checked-out source over source-traced prose that can lag the code, and verify on any conflict or confusion. Kept in the always-on brief (universal, needed before any skill loads) rather than duplicated into each skill; per-skill source-map pointers remain the specific layer. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * fix(runtime): scope the source-authority escape hatch to the CLI The previous version told agents the "checked-out source is the deeper authority" for verifying behavior. That over-claims: the repos in a task's brief come from GetWorkspaceRepos + project github_repo resources (per-workspace config, see daemon.registerTaskRepos), not the Multica platform source. A generic agent's checked-out source is its own app, not Multica's code, so it cannot verify a Multica skill/brief claim against it. The only universally available authority for Multica behavior is the live CLI (`--help` / `--output json` / observed command behavior). Re-scope the line accordingly and state plainly that the platform's source is not in the workdir. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * revert(runtime): drop the source-authority escape-hatch line Reverts the brief addition from `fdd5e82df` and its follow-up `cc67b2088`. The `--help` discovery fallback already in the Available Commands note is enough; the extra trust-precedence sentence was unnecessary. runtime_config.go is now identical to `6ca27ad74`. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * docs(claude): remind to update built-in skills on CLI/field/behavior changes Add a Coding Rule: when a change touches a CLI command/flag, API field, or product behavior that a built-in skill documents, update that skill's SKILL.md and source-map in the same PR. Lives in the repo dev-guide (read when working in this repo), not the runtime brief — the runtime brief is injected into every workspace, where most agents have no Multica skill to update. AGENTS.md is a pointer to CLAUDE.md, so no mirror needed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-06-03 18:51:03 +08:00
LinYushen	de900b2ba6	feat(server): funnel/community/commercial business metrics + PostHog pairing (MUL-2949) (#3698 ) * feat(server): funnel/community/commercial business metrics + PostHog pairing (MUL-2949) PR3 of the Grafana board metrics split (parent MUL-2328). Adds 23 new Prometheus counter/histogram families to the PR2 BusinessMetrics collector covering the activation/community/commercial funnels, and binds every PostHog event emission to a matching metric increment so the two sides cannot drift. Funnel: signup, workspace_created, team_invite_sent/accepted, onboarding_, cloud_waitlist_joined. Content: issue_created, chat_message_sent, agent_created, squad_created, autopilot_created, issue_executed. Runtime: runtime_registered/ready/failed/offline + ready_seconds histogram, daemon_ws_message_received_total. Autopilot: autopilot_run_started/terminal/skipped. Webhook/GitHub: webhook_delivery_total, github_event_received_total, github_pr_review_total, github_pr_merge_seconds histogram. CloudRuntime: cloudruntime_request_total + duration histogram, wired through a small RequestRecorder interface so the cloudruntime package stays decoupled from metrics. Commercial: feedback_submitted, contact_sales_submitted. The pairing helper metrics.RecordEvent(client, m, ev) emits the PostHog event AND increments the matching counter via IncForEvent dispatch, reading labels from the analytics event Properties. Every existing h.Analytics.Capture(analytics.X(...)) call site has been migrated to the helper across handler/, service/, and cmd/server/runtime_sweeper.go. Lint enforcement (server/internal/metrics/business_pairing_test.go): - TestEveryAnalyticsEventHasPrometheusCounter: every Event constant in analytics/events.go either dispatches via IncForEvent or is in the taskMetricEvents allow-list (PR2 typed RecordTask* methods). - TestNoNakedAnalyticsCaptureInHandlersOrServices: AST-walks handler/ service/cmd-server for direct Analytics.Capture(...) calls — only service/task.go's captureTaskEvent helper is allow-listed. - TestEveryAnalyticsRecordEventTakesAnalyticsHelper: validates the third arg of every metrics.RecordEvent call is built from analytics.. Cardinality protection: all new label values pass through fixed allow-lists in labels_pr3.go; unknown values collapse to 'other'/'unknown'/'error'. Refs: - Spec MUL-2328 / MUL-2949. - Builds on PR2 (MUL-2948) — collectors registered through the same BusinessMetrics struct, no separate Registry. - Uses PR1's taskfailure.Reason (MUL-2946) for runtime_failed's failure_reason label via NormalizeFailureReason. Out of scope: Sampler-class metrics (PR4 / MUL-2947), pr_review_total emission point (no review event handler exists yet — counter is defined, TODO to wire up when /api/webhooks/github grows pull_request_review handling). Co-authored-by: multica-agent <github@multica.ai> fix(server): tighten PR3 review items — signup_source bucket, fill platform/kind/form_source enums, onboarding_started server emission, lint scope (MUL-2949) Addresses 张大彪's review on #3698: 1. signup_source: NormalizeSignupSource added to labels_pr3.go with a fixed allow-list bucket (direct/google/twitter/linkedin/.../other). Parses JSON cookie payload for utm_source/source/referrer fields, strips URL schemes, maps well-known hostnames to channel buckets. PostHog event still ships the raw cookie value for analytics; only the Prometheus label is bucketed. 2. Filled the unknown/other label gaps: - analytics.IssueCreated and analytics.ChatMessageSent now take a platform parameter sourced from middleware.ClientMetadataFromContext (X-Client-Platform header) at the handler. Autopilot-originated issues stamp PlatformServer. - analytics.FeedbackSubmitted now takes a kind parameter; CreateFeedback reads req.Kind (default "general") so the picker selection lights up the metric's kind label instead of long-term "other". - analytics.ContactSalesSubmitted now takes a formSource (page / onboarding / agents_page); CreateContactSales reads req.Source. The metric reads ev.Properties["form_source"] so the analytics CoreProperties.Source ("marketing_contact_sales") stays backward-compat for PostHog dashboards. 3. analytics.OnboardingStarted helper added; server-side emission lives in PatchOnboarding, fired exactly once per user on the first PATCH that carries a non-empty questionnaire payload (firstTouch logic compares prior bytes against {} / null). Frontend onboarding_started keeps firing on page open; the server emission is what guarantees the Prometheus counter exists so Grafana can be cross-checked against the PostHog funnel without depending on the SDK roundtrip. 4. business_pairing_test.go tightened: - TestNoNakedAnalyticsCaptureInHandlersOrServices now allow-lists at function granularity (just captureTaskEvent in service/task.go), not whole-file. Any future naked Capture in the same file fails CI. - TestEveryAnalyticsRecordEventTakesAnalyticsHelper now does def-use tracking inside the enclosing FuncDecl: when RecordEvent's third arg is an ast.Ident, the test walks the function body for the assignment that defined it and confirms the RHS is an analytics.<Helper>(...) call. Bare local idents that didn't originate from analytics are now caught. 5. gofmt -w applied across the touched files; gofmt -l clean. Tests: go test ./internal/metrics/... ./internal/analytics/... pass. Pre-existing TestClaimTask_/TestWebhook_MergedPR/TestDeleteIssueByIdentifier failures on origin/main are DB-environment-dependent and not regressions from this change. Co-authored-by: multica-agent <github@multica.ai> fix(server): normalise onboarding_started platform label + regression test (MUL-2949) Addresses 张大彪's last review nit: - IncForEvent's EventOnboardingStarted case now wraps the platform property with NormalizePlatform, matching every other platform-bearing metric. A misbehaving frontend can no longer leak a raw X-Client-Platform header value into the multica_onboarding_started_total{platform=...} series. - New labels_pr3_test.go covers every PR3 normalizer with both a happy-path value and an unknown value, asserting the unknown collapses to the documented fallback bucket. Includes a focused regression for onboarding_started: emits one event with an attacker-shaped platform string and asserts the metric only exposes web + unknown label values (no raw header bleed). - testutil.go gains a small GatherForTest helper so the regression test can pull the typed MetricFamily map without re-implementing the registry-walk dance. Co-authored-by: multica-agent <github@multica.ai> * fix(server): NormalizeTaskSource on workspace_created + document lint limitations (MUL-2949) Final review touch-ups before merge: - IncForEvent's EventWorkspaceCreated case wraps source through NormalizeTaskSource, matching the other source-bearing dispatches (issue_created, agent_created, issue_executed). Closes the last raw property leak in the dispatcher table. - business_pairing_test.go inline docstrings now spell out the two known limitations of the lint gate that 张大彪 / Eve flagged: analyticsBackedIdents matches by ident NAME (not SSA def-use, so a nested-scope shadow could pass) and isMetricsRecordEvent hard-codes the import alias set. PR description carries a Follow-ups section with the same two items so the work is visible after merge. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: 魏和尚 <agent+wei@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-03 16:39:06 +08:00
Bohan Jiang	5900d8b637	fix(issues): make start_date/due_date timezone-stable calendar days (#3618 ) (#3692 ) * fix(issues): store start_date/due_date as DATE, not timestamp (MUL-2925) These fields are calendar days (the pickers offer no time-of-day), but were stored as TIMESTAMPTZ. A client serializing local midnight via toISOString() folded its timezone into the instant, so the day shifted by the local offset (GH #3618). Migrate the columns to DATE and parse/serialize date-only "YYYY-MM-DD". ParseCalendarDate still accepts legacy RFC3339 (truncated to the UTC day) so older clients keep working. Co-authored-by: multica-agent <github@multica.ai> * fix(issues): render start_date/due_date as timezone-stable calendar days (MUL-2925) Pickers now emit date-only "YYYY-MM-DD" (local calendar day) instead of toISOString(), and every read formats via the shared @multica/core/issues/date helpers with timeZone:"UTC" so the day never shifts with the viewer's offset. The Gantt's existing UTC bucketing is now correct. Covers web/desktop pickers, quick-set menu, list/board/detail/activity, and the mobile due-date picker. Co-authored-by: multica-agent <github@multica.ai> * fix(issues): address date-only review — loud-fail ambiguous dates, finish display sweep (MUL-2925) Review follow-ups on #3692: - ParseCalendarDate no longer silently truncates a legacy non-midnight RFC3339 to the wrong UTC day; it accepts only YYYY-MM-DD or an exact UTC-midnight instant and rejects ambiguous ones loudly. Adds util unit tests. - migration 112 pins the TIMESTAMPTZ->DATE conversion to UTC explicitly via AT TIME ZONE 'UTC' (was session-timezone dependent); down migration too. - Convert remaining date-change display sites to formatDateOnly: inbox detail label (web) and mobile activity + inbox labels (were new Date()+local format). - CLI --start-date/--due-date help now says YYYY-MM-DD, not RFC3339. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-03 14:34:01 +08:00
Multica Eve	a72fb020de	Add business metrics collectors (#3695 ) Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-06-03 14:32:44 +08:00
Multica Eve	10afd1af1b	feat(server): introduce pkg/taskfailure classifier and switch in-flight failure_reason writes (MUL-2946) (#3693 ) Lift MUL-1949's offline backfill failure_reason taxonomy into a shared in-flight classifier so the agent_task_queue.failure_reason column is written with refined values (provider_auth_or_access, context_overflow, provider_capacity_or_rate_limit, …) at write time rather than waiting on SQL backfill to re-classify after the fact. PR1 of the Grafana board plan in MUL-2328 — the upcoming PR2 reuses pkg/taskfailure.AllReasons() to pre-warm the Prometheus failure_reason label set. * server/pkg/taskfailure: new package with the canonical 21 Reason constants (7 platform-side + 14 agent_error.* sub-reasons), AllReasons() returning a defensive copy, IsAgentError() prefix check, and Classify(rawError) Reason mirroring the SQL CASE rules from MUL-1949 (db-boy's analysis). 100% statement coverage. * server/internal/daemon/daemon.go: route the 'agent_error' coarse fallback paths (StartTask error, runTask early-return error, CompleteTask permanent rejection, reportTaskResult default branch) and the executeAndDrain default error case (chained after classifyPoisonedError) through taskfailure.Classify so blocked / timeout / unknown-status results all carry a refined reason on the wire. * server/internal/service/task.go: FailTask classifies errMsg when the daemon-supplied failureReason is empty, eliminating the legacy COALESCE(.., 'agent_error') landing. * server/internal/daemon/poisoned.go: alias FailureReasonIterationLimit and FailureReasonAPIInvalidRequest to the canonical taskfailure constants. agent_fallback_message and codex_semantic_inactivity are pre-existing operational reasons not in the canonical 21 — kept as literals for now and revisited in a follow-up PR. Backfill SQL from MUL-1949 stays as the authoritative offline source of truth; this PR keeps the in-flight classifier in lock-step with the SQL CASE expression so historical and future rows share the same taxonomy. No behavior change for the platform-side reasons (queued_expired, runtime_offline, runtime_recovery, timeout, etc.) which already align with the canonical set. Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-06-03 13:52:56 +08:00
Qi Yijiazhen	91c1e51411	feat(editor): add / slash-command palette for invoking agent skills (#3159 ) * feat(editor): add / slash-command palette for invoking agent skills Adds a `/` trigger in the chat box that opens a popover listing the active agent's skills. Selecting an item inserts a `[/label](slash://skill/<id>)` token; the daemon extracts those IDs in `buildChatPrompt` and emits an "Explicitly selected skills:" block using the canonical names from the agent's skill registry — labels are display-only and never trusted. Built on Tiptap's `Mention` extension so the suggestion lifecycle, keyboard routing, and IME handling mirror the existing `@` mention UX. Item list is sourced from the React Query workspace cache (no per-keystroke fetch). Gated behind a new `enableSlashCommands` prop so only `chat-input` opts in; other `ContentEditor` consumers (issue editor, comments) are unaffected. Read-only markdown surfaces render the token as a `.slash-command` pill via a custom link renderer + sanitize-schema/url-transform allowlists. Closes #3108 * fix(i18n): add slash_command editor copy for ko/ja The PR added slash_command popover empty-state keys to en + zh-Hans only; locales/parity.test.ts requires every locale to cover every EN key, so ko and ja failed CI. Add the two keys (no_skills_configured, no_results) matching existing skill terminology (스킬 / スキル). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Naiyuan Qing <145280634+NevilleQingNY@users.noreply.github.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 15:57:42 +08:00
LinYushen	2348301d2b	fix: gate private squad leader bypass (MUL-2860) (#3648 ) * fix: gate private squad leader from being triggered by unauthorized members Add canEnqueueSquadLeader helper that checks canAccessPrivateAgent before allowing a squad leader to be enqueued. Gate all EnqueueTaskForSquadLeader call sites: 1. enqueueSquadLeaderTask (comment trigger, assign trigger, backlog→todo) 2. triggerChildDoneSquad (child-done → parent squad leader) 3. autopilot.go (defensive comment; actor is always agent → always passes) Also fix validateAssigneePair's squad branch to run canAccessPrivateAgent on the squad leader, returning 403 'cannot assign to squad with private leader' when the actor lacks access. Thread actorType/actorID through notifyParentOfChildDone → dispatchParentAssigneeTrigger → triggerChildDoneSquad so the child-done path can enforce the private-leader gate. Regression tests: - Plain member blocked from create-issue to private-leader squad (403) - Plain member blocked from update-issue to private-leader squad (403) - Owner allowed to assign private-leader squad - Plain member comment on squad-assigned issue doesn't trigger private leader - Child-done by plain member doesn't trigger parent's private leader - Agent actor can still trigger private leader via comment Closes MUL-2860 Co-authored-by: multica-agent <github@multica.ai> * fix: add private-leader gate to autopilot save + dispatch paths - validateAutopilotAssignee squad branch: call canAccessPrivateAgent on the leader, returning 403 for unauthorized members at save time. - service/autopilot.go: add canCreatorAccessPrivateLeader helper that mirrors the handler-level canAccessPrivateAgent logic (agent creators pass; member creators must be owner/admin or agent owner). - Gate both dispatch paths (dispatchCreateIssue and dispatchRunOnly) with fail-closed check: if leader is private and creator lacks access, the run is skipped instead of triggering the private leader. Regression tests: - Plain member create autopilot to private-leader squad → 403 - Plain member update autopilot to private-leader squad → 403 - Owner create autopilot to private-leader squad → 201 - Owner-created autopilot dispatch → issue_created (positive) - Legacy plain-member-created autopilot dispatch → skipped (fail-closed) Co-authored-by: multica-agent <github@multica.ai> * test: add run_only legacy private-leader squad dispatch regression test Covers the dispatchRunOnly path explicitly, complementing the existing create_issue dispatch test. Both dispatch branches now have direct test coverage for the private-leader fail-closed gate. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-06-02 15:47:57 +08:00
Naiyuan Qing	4ae4722ef0	fix(comments): preserve direct parent on replies (#3579 ) Co-authored-by: multica-agent <github@multica.ai>	2026-06-01 08:28:15 +08:00
fengchangguo-star	2cf8107fc8	feat(email): support implicit TLS (SMTPS/465) for SMTP relay (MUL-2768) (#3340 ) * feat(email): support implicit TLS (SMTPS/465) for SMTP relay The SMTP relay previously only did opportunistic STARTTLS: it dialed plaintext and upgraded if the server advertised STARTTLS. Providers that only offer implicit TLS on port 465 and do not advertise STARTTLS (e.g. Aliyun enterprise mail) could not be used as a relay at all. Add an SMTP_TLS env var: - unset / starttls (default): unchanged STARTTLS-upgrade behavior. - implicit / smtps / ssl: dial with tls.DialWithDialer (SMTPS). Implicit TLS is auto-enabled when SMTP_PORT=465 and SMTP_TLS is unset, so the common case works with no extra config. The startup log line now reports the negotiated mode (starttls / implicit-tls). Co-authored-by: multica-agent <github@multica.ai> * feat(email): plumb SMTP_TLS through selfhost compose, warn on unknown values The backend reads SMTP_TLS but docker-compose.selfhost.yml never forwarded it, so SMTP_TLS=implicit on a non-standard port (or an explicit starttls override on 465) silently did nothing inside the container. Add it to the backend.environment block. Also log a one-line warning when SMTP_TLS is set to an unrecognized value (e.g. "tls"/"true"/"on"), which would otherwise fall through to STARTTLS and fail to dial a 465 SMTPS port with no startup hint. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * test(email): cover SMTP_TLS precedence and alias resolution Table-driven test over NewEmailService asserting the implicit-TLS decision: 465 auto-enables implicit; explicit starttls on 465 overrides auto-detect; implicit/smtps/ssl aliases (case-insensitive, whitespace-trimmed) force SMTPS on any port; unknown values fall back to starttls. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * docs: document SMTPS / SMTP_TLS support, drop "465 unsupported" Port 465 implicit TLS is now supported, so the five places that said it was unsupported are wrong. Replace those sentences, add an SMTP_TLS row to the environment-variables tables (EN + ZH), and add a copy-pasteable SMTPS env block to the auth-setup pages. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: guofengchang <guofengchang@cumulon.com> Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 18:15:04 +08:00
Naiyuan Qing	3187bbf90c	feat(comments): re-add since-delta + cold-start thread read + parent-root write normalization (#3494 ) * feat(comments): since-delta new-comment hint + default-on comment session resume (#3432) * feat(db): add unresolved comment count + list filter queries Add CountUnresolvedComments (excludes the agent's own comments) and ListUnresolvedCommentsForIssue. Both are additive — existing callers stay on the unfiltered queries — so old clients are unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(handler): support unresolved-only comment listing Wire an additive `unresolved` query param into ListComments. Defaults off so an old CLI that never sends it gets unchanged behavior; only true/1 enable it. Rejects combining unresolved with thread/recent (whole-issue filter vs navigation models). Includes filter + count query tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(handler): plumb unresolved count + thread root into claim, gate comment resume Populate trigger_parent_id (thread root of the trigger comment) and unresolved_count (excludes the agent's own comments) on comment-triggered claim responses. Both fields are omitempty so old daemons ignore them. Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION (default off): resumed comment turns can inherit the prior turn's "Done." final message, so this stays an explicit rollout switch. The runtime-match and poisoned-session guards still apply regardless of the flag. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(daemon): inject unresolved-comments hint + resolve step into agent brief Add a shared BuildUnresolvedCommentsHint helper rendered on both the per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It ships only the count and the relevant CLI call — never comment bodies — so the server stays cheap. Thread case points at --thread <root>; issue case points at --unresolved. Suppressed when the count is 0. Also add a workflow step telling the agent to `multica comment resolve <thread-root>` once a thread is fully handled, so the unresolved set converges. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cli): add comment list --unresolved and comment resolve command Add an --unresolved filter to `issue comment list` (wired to the server's unresolved param, rejected when combined with --thread/--recent) and a top-level `comment resolve <id>` command that POSTs to the existing /api/comments/{id}/resolve endpoint, letting an agent close threads it has fully handled. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(comments): since-delta new-comment hint + default-on comment resume Simplifies the comment-triggered agent flow down to what's actually needed: - New-comment awareness is now a pure time delta: the claim response carries new_comment_count + new_comments_since (anchored on the prior run's started_at, never completed_at so a long run can't miss comments). The per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s) since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the two surfaces can't drift. Cold start (no prior run) falls back to a plain read. - Comment-triggered tasks resume the prior session by default (same runtime), dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS comment" prompt guard defends against inheriting the prior turn's "Done." marker; GetLastTaskSession still excludes poisoned sessions. - Drops the resolved-based machinery from the first draft: CountUnresolvedComments / ListUnresolvedCommentsForIssue queries, the `comment list --unresolved` flag, the `multica comment resolve` command, and the resolve workflow step. - Removes the verbose cursor-pagination paragraph from the comment prompt; the --thread/--recent/--since flags stay in the CLI/API, just no longer explained inline every turn. Compatibility: new claim fields are omitempty (old daemons ignore them). Comment resume is default-on and affects even old daemons, which already consume prior_session_id. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(comments): collapse reply parent_id to thread root on write Comment threads are a 2-level model (root + flat replies, like Linear/Slack), enforced today only by the UI and the agent path — the CreateComment handler stored whatever parent_id it was handed, and the agent-side flatten walked just one level, so a reply-to-a-reply could land at depth 3+. Add GetThreadRoot (a recursive walk to the parent_id=NULL root) and run both write paths (handler.CreateComment, service.createAgentComment) through it, so every stored reply's parent_id IS its thread root. Readers can now treat parent_id as the thread root without re-walking. The agent-drift guard still compares the raw parent_id to the trigger comment before normalization. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(comments): cold-start reads triggering thread, warm keeps --thread pointer The since-delta rework dropped the thread-first read on the COLD path: a first-time agent fell back to the flat `comment list` dump (oldest-first, cap 2000), burying the trigger's context in ancient chatter. Point cold start at the triggering conversation instead via a shared BuildColdCommentsHint (`--thread <trigger> --tail 30` + a --recent pointer for cross-thread background). On the WARM path, --since is a pure time delta and can miss the triggering thread's pre-anchor history, so BuildNewCommentsHint now also emits a --thread pointer. Both surfaces (per-turn prompt + CLAUDE.md workflow) render via the shared helpers so they cannot drift (PR #2816 rule). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 10:38:37 +08:00
QJ Yu	56bddc5e06	fix(issues): place new issues at top of column in manual sort mode Fixes PER-145.	2026-05-28 14:20:20 +08:00
Bohan Jiang	e3723dbb22	refactor(autopilot): centralize timezone default and cover invalid-timezone fallback (MUL-2742) (#3356 ) Follow-up nits from PR #3324 review: - Export DefaultAutopilotTriggerTimezone so the autopilot scheduler reuses the same source-of-truth as the service layer instead of hardcoding "UTC" in two places. - Add tests that lock down the invalid-timezone fallback (e.g. "Foo/Bar") for both buildIssueDescription and interpolateTemplate, so a future change to the resolve/format helpers can't silently emit a half-formatted timestamp or date. Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-27 15:40:05 +08:00
YOMXXX	607e64d722	fix(autopilot): render trigger output in trigger timezone (#3324 )	2026-05-27 15:26:49 +08:00
Bohan Jiang	17714c3ad1	fix(create-issue): preserve parent_issue_id through Create with agent flow (MUL-2534) (#3083 ) * fix(create-issue): preserve parent_issue_id through Create with agent flow (MUL-2534) When the create-issue modal was opened from the "Add sub issue" entry on an existing issue and the user switched to "Create with agent", the parent_issue_id was silently dropped: switchToAgent only forwarded prompt + actor + project_id, the AgentCreatePanel had no notion of parent context, and the daemon prompt never instructed the agent to pass --parent <uuid>. The sub-issue intent was lost and the new issue landed as a standalone. This fix threads parent_issue_id through the whole pipeline silently — no new editable form field, the existing carry channel handles it: - Frontend: ManualCreatePanel.switchToAgent + AgentCreatePanel.switchToManual now carry parent_issue_id (and identifier, for display) so the sub-issue intent survives mode flips in either direction. AgentCreatePanel reads parent from `data`, forwards to api.quickCreateIssue, and renders a read-only "Sub-issue of MUL-XX" chip so the user can see the relationship. - API: quickCreateIssue accepts optional parent_issue_id. - Backend: QuickCreateIssueRequest validates parent_issue_id belongs to the same workspace (same path as CreateIssue), persists it in QuickCreateContext, and the daemon claim handler resolves the parent's identifier for prompt context. - Daemon prompt: when ParentIssueID is set, buildQuickCreatePrompt instructs the agent to pass `--parent <uuid>` and treat the modal entry point as authoritative. Tests cover all three hops: switchToAgent carry payload, AgentCreatePanel → api.quickCreateIssue, and the daemon prompt's --parent injection (with both identifier-present and UUID-only fallback branches). Co-authored-by: multica-agent <github@multica.ai> * test(create-issue): cover quick-create parent trust boundary + identifier fallback (MUL-2534) Address review on PR #3083: - Add server-side test for POST /api/issues/quick-create parent_issue_id: same-workspace parent threads through QuickCreateContext.ParentIssueID, foreign-workspace and bogus UUIDs return 400 and never enqueue a task. - Fall back to `data.parent_issue_identifier` in ManualCreatePanel's switchToAgent when the parent detail query hasn't hydrated yet, so the agent chip never renders "Sub-issue of " with an empty tail. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-27 14:18:48 +08:00
Bohan Jiang	341ce7bfa5	feat: support local working directory for projects (MUL-2618 v1) (#3283 ) * feat(project): add local_directory project_resource type (MUL-2662) Adds a second project_resource type alongside github_repo so a project can be pinned to an existing directory on a specific daemon (the v1 of the local-working-directory flow tracked in MUL-2618). The ref schema is { local_path, daemon_id, label? }; local_path must be absolute and daemon_id is required. The same (daemon_id, local_path) pair is allowed on multiple projects by design — no UNIQUE constraint is added. Implementation reuses the existing project_resource API surface: the new type is wired through the validator switch with no migration, no new events, and no daemon-handler changes (daemon already passes through arbitrary resource types via ProjectResources). The CLI gains --local-path / --daemon-id / --ref-label shortcuts so `multica project resource add --type local_directory` mirrors the existing `--type github_repo --url ...` ergonomics; the generic --ref flag still works for both types. Tests cover the full CRUD lifecycle, the same-path-across-projects allowance, the same-path-same-project conflict, the validator rejections (missing/blank/relative path, missing daemon_id, wrong payload type), and the cross-platform isAbsoluteLocalPath helper. Co-authored-by: multica-agent <github@multica.ai> * feat(project): add update endpoint + label-shadow guard for project_resource (MUL-2662) Addresses the Elon review on PR #3263: - Add PUT /api/projects/{id}/resources/{resourceId} with sqlc query, matching handler, CLI `project resource update`, and a new EventProjectResourceUpdated WS event. resource_type stays immutable; ref/label/position are all individually optional. - Catch same-project (daemon_id, local_path) collisions where only the embedded label differs — the row-level UNIQUE only matches the full ref JSON, so a label typo would otherwise let the same working directory bind twice. - Tests cover the update lifecycle (label-only / ref / clear / 404 / invalid path) and the label-shadow conflict on both create and update; the in-place rename still succeeds because the conflict scan ignores the row being edited. Incidental: regenerating sqlc picked up a missing skills_local scan in UpdateAgentCustomEnv that drifted in from #3200. Co-authored-by: multica-agent <github@multica.ai> * fix(project): close bundled-create label-shadow gap + merge resource_ref on CLI update (MUL-2662) Two follow-ups from MUL-2662 review round 2: - CreateProject inline resources path now dedupes local_directory entries on (daemon_id, local_path) before opening the transaction. The DB-level UNIQUE(project_id, resource_type, resource_ref) constraint only fires on a full JSON match, so two rows with the same target but different `label` would otherwise slip past. Standalone POST/PUT already cover this via findLocalDirectoryConflict; bundled create was the missing surface. - `multica project resource update` now seeds resource_ref from the existing row before applying per-type shortcut flags, so `--default-branch-hint x` on its own no longer constructs a payload missing `url` (which the server 400s on). Local_directory partial edits get the same merge behavior. Co-authored-by: multica-agent <github@multica.ai> * feat(desktop): local_directory project_resource UI (MUL-2665) (#3273) * feat(desktop): local_directory project_resource UI (MUL-2665) First UI surface for the local-working-directory flow tracked in MUL-2618. Lets users on the desktop pin a project to an existing folder on this machine; web stays read-only since the per-daemon check can't be done in the browser. What's new for the renderer: - ProjectResourcesSection grows a desktop-only "Add local directory" button next to the existing GitHub-repo popover. Clicking it opens Electron's native folder picker, validates the path through a new IPC pair (existence + r/w), and submits a project_resource of resource_type=local_directory with daemon_id pulled live from daemonAPI.getStatus. - LocalDirectoryRow renders the rename pencil + path tooltip, and greys out when ref.daemon_id != this machine's daemon_id (with a "only available on the machine that registered this directory" tooltip). Delete stays enabled so users can drop stale registrations from any device. - LocalDirectoryHint sits above the issue-detail comment composer and shows "Agent will work in-place at {label} ({path})" when the issue's project has a local_directory matching this daemon. Hidden on web. - TaskStatusPill picks up a new "waiting_for_directory_release" stage that the daemon will publish when it dequeues a task but can't acquire the path lock. The render is in place now so the daemon sibling subtask can wire the status string without an additional UI PR. Plumbing: - @multica/core/types gains LocalDirectoryResourceRef + UpdateProjectResourceRequest, and the api client gets the matching PUT method backed by the server endpoint that landed in `2ac3faebb` (MUL-2662). A useUpdateProjectResource hook drives the in-place label edit. - New Electron handlers under apps/desktop/src/main/local-directory.ts: local-directory:pick -> dialog.showOpenDialog (openDirectory) local-directory:validate -> stat + access(R_OK + W_OK) exposed through the preload as desktopAPI.pickDirectory / validateLocalDirectory. View code talks to them via a thin packages/views/platform helper that returns reason=unsupported on web instead of crashing. - useLocalDaemonStatus exposes the local daemon's id, device name, and running flag from daemonAPI.onStatusChange so the renderer can do the cross-device match without coupling to the desktop preload typings. Tests: - pickStageKeys gets a unit test covering the new stage and proving the directory-release status outranks availability hints. - LocalDirectoryHint tests cover the four render branches (no project, no daemon, foreign daemon, matching daemon). - i18n parity stays green; new keys added under projects.resources.* and chat.status_pill.stages.waiting_for_directory_release in both locales. Out of scope (will land separately): - The daemon-side waiting/lock signal that flips the pill into the new state. - Adding local_directory to the create-project modal's bulk attach flow. - Docs page refresh for project-resources.mdx — left for the MUL-2618 umbrella sweep. Co-authored-by: multica-agent <github@multica.ai> * fix(desktop): hide rename for foreign daemon local_directory rows (MUL-2618) Address review nit on #3273: the rename pencil was gated only by `canEdit`, so a foreign / unknown-daemon row still showed it even though the spec says cross-device rows are disabled. Gate rename on `!mismatch` so it disappears on those rows; delete stays available so a stale registration can still be dropped from any device. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> * feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663) (#3274) * feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663) Wires up the daemon side of the local_directory project_resource introduced in MUL-2662. When a task is dispatched against a project whose resources include a local_directory pinned to this daemon's UUID, the daemon now: - Validates the path (absolute, exists, daemon process can read+write, not in the system-root / $HOME blacklist) and fails the task fast on any precondition violation, with a user-readable reason. - Serialises concurrent tasks on the same on-disk path via a daemon-local LocalPathLocker keyed by symlink-resolved realpath. The lock is held for the entire task lifetime (claim → context write → agent → result report). - When the lock is contended, the daemon flips the row to a new waiting_local_directory status on the server (carrying a wait_reason like "<path> (held by task <short id>)") so the UI can render "等待本地目录释放" instead of leaving the row silently in dispatched past the sweeper timeout. The status accepts being woken into running once the lock is acquired. - Sets execenv.WorkDir to the user's path (no copy, no mount). envRoot still lives under workspacesRoot/<wsID>/ and hosts output/, logs/, and .gc_meta.json — the daemon's logbook for the run. - Stamps GCMeta.LocalDirectory=true so the GC loop never RemoveAlls envRoot for these tasks (gcActionClean → gcActionCleanArtifacts, gcActionOrphan → gcActionSkip). The user's directory was never under envRoot to begin with, so this is defense in depth. - Skips execenv.Reuse for local_directory tasks because the prior WorkDir is the user's path and reusing it through that code path loses the envRoot association the GC loop needs. Prepare is cheap here (no clone, no copy), so always running it is fine. Server-side protocol changes: - New CHECK value 'waiting_local_directory' on agent_task_queue.status plus a wait_reason TEXT column (migration 109). - All cancel / active / counted-as-running / orphan-recovery queries expanded to include the new status; FailStaleTasks intentionally excludes it (the daemon owns the wait). - New SQL MarkAgentTaskWaitingLocalDirectory(id, reason) and a relaxed StartAgentTask that accepts both dispatched and waiting_local_directory as preconditions (and clears wait_reason on the way through). - New POST /api/daemon/tasks/{taskId}/wait-local-directory endpoint, TaskService.MarkTaskWaitingLocalDirectory broadcaster, and matching daemon Client.MarkTaskWaitingLocalDirectory. Tests cover: path blacklist + R/W enforcement, mutex serialisation + ctx-cancelled wait, lock handover between two tasks, GC never returns gcActionClean / gcActionOrphan for local_directory rows (with negative control for the standard path), and Prepare/Cleanup correctly substitute + protect the user's WorkDir. The desktop UI side (UI for adding a local_directory resource, surfacing the "等待本地目录" badge) is MUL-2665; the agent-task lifecycle changes (no branch switch, dirty-tree tolerant, auto-commit) are MUL-2664. This PR targets the shared MUL-2618 v1 feature branch agent/j/912b8cb1, not main; the whole v1 will be merged to main together when complete. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): tighten local_directory status, symlink, cancel handling (MUL-2618) Address the 3 must-fix items from Elon's review of PR #3274. 1. Status string unified. The server / daemon publish `waiting_local_directory`; align views, locales, and the pickStageKeys test (PR #3273 had used `waiting_for_directory_release` on a placeholder string). Without this, the daemon's wait state never reached the pill once the two siblings merged. 2. validateLocalPath now also runs the blacklist against the symlink-resolved realpath, with macOS's `/etc` -> `/private/etc` redirect handled via `isBlacklistedRealPath` which compares canonical forms. Without this, a symlink such as `/Users/me/proj/home -> /Users/me` slipped the literal $HOME check while every daemon write still landed in the user's home. Tests cover symlink-to-home, symlink-to-system-root, and the negative case (symlink to a regular subdirectory). 3. acquireLocalDirectoryLockIfNeeded now spins up a cancellation watcher inside `onWait` (lazy — the fast path stays free) so the gap between dispatch and StartTask responds to server-side cancel or row deletion. If the watcher fires while the daemon is parked on the path mutex, the lock-wait context is cancelled, Acquire returns promptly, and the helper exits silently the same way the run-phase poller does. New TestAcquireLocalDirectoryLock_CancelDuringWait exercises the path end-to-end with a fake server. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): unconditional canonical blacklist + Windows drive-root generalisation (MUL-2618) - validateLocalPath now always runs isBlacklistedRealPath on the symlink-resolved path, not only when it differs from absPath. The old guard let users type the canonical form of an OS-symlinked banned root (e.g. /private/tmp, /private/etc, /private/var on macOS) straight through, since EvalSymlinks is a no-op on already-canonical input. - Windows drive-root rejection moved off the static C/D/E/F enumeration onto filepath.VolumeName via a new isDriveRoot helper, so removable / network drives mounted at G:..Z: and UNC \\server\share roots are also blocked. systemRootBlacklist keeps the well-known C:\ trees only. - Tests: macOS-only case exercises direct /private/{tmp,etc,var}; a new TestIsDriveRoot covers the Windows generalisation (skipped on POSIX runners by runtime guard). Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> * feat(views): wire waiting_local_directory end-to-end in issue UI + presence (MUL-2618) Connect the daemon-emitted `task:waiting_local_directory` and `task:running` events through to issue execution log, sticky agent banner, activity indicator, and agent presence so a parked task is no longer invisible on the issue page. - Add `waiting_local_directory` to `AgentTask.status` and the typed `task:running` / `task:waiting_local_directory` WS event payloads. - Chat realtime sync writes both new statuses into the pending-task cache so the chat StatusPill flips out of a stale `dispatched` frame. - ExecutionLogSection: count `waiting_local_directory` as active, add tone + status label, treat parked tasks the same as dispatched for time anchor / transcript visibility / terminate-confirm note. - AgentLiveCard: subscribe to both new events, rank the parked state between dispatched and queued, and surface a "is waiting for the local directory" banner with the muted "Clock" treatment used for queued. - IssueAgentActivityIndicator: route parked tasks into the queued bucket so the hover stack and chip stay visible. - derive-presence: parked tasks count toward `queuedCount` so the agent workload chip stays out of `idle` while the daemon waits on the path lock. - Locales: add `agent_live.is_waiting_local_directory` and `execution_log.status_waiting_local_directory` (en + zh-Hans). Co-authored-by: multica-agent <github@multica.ai> * feat(project): enforce one local_directory per (project, daemon) (MUL-2618) The daemon-side resolver picks the first matching local_directory by daemon_id, so allowing two rows on the same daemon — even at different paths — let the agent silently write into whichever sorted first. Tighten the invariant top to bottom: - server: `findLocalDirectoryConflict` rejects any second row sharing a daemon_id, regardless of `local_path` or label. Bundled-create surface in `CreateProject` runs the same daemon-scoped dedupe up front. - daemon: `findLocalDirectoryAssignment` fails fast when it finds more than one row pinned to the current daemon (older API client / direct DB writes can still produce that state — refuse to guess). - desktop UI: hide the "Add local directory" action once the current daemon owns a row on this project, with a hint and a defensive toast on the call path; foreign-daemon rows stay visible read-only as before. - Tests: * daemon: new `two local_directory rows on this daemon fail fast` / `local_directory rows on different daemons coexist` cases. * handler: rewrite the legacy `LabelShadow` cases as `DaemonScopedConflict` / `BundledLocalDirectoryDaemonConflict` — asserts 409 on same-daemon different-path, 201 on per-daemon bundles. - Locales: en + zh-Hans copy for the new hint + toast. Co-authored-by: multica-agent <github@multica.ai> * chore(sqlc): drop stale skills_local in UpdateAgentCustomEnv (MUL-2618) Follow-up to the main-merge in `0f8e8ca7`: the auto-merge preserved most of main's skills_local revert but kept the column reference inside the UpdateAgentCustomEnv scanner because that block hadn't been touched by either side. Re-running `sqlc generate` regenerates the file without skills_local in this query, matching the rest of the file and the post-revert schema. Co-authored-by: multica-agent <github@multica.ai> * feat(create-project): binary source picker — repos OR local directory Turn the create-project dialog's "Repos" pill into a binary Source picker. A project's source is mutually exclusive: either a set of GitHub repos (worktree mode, default) or a single local working directory (local mode, desktop-only). Mirrors the constraint the backend will enforce next. Behavior: - Pill shows the active mode's selection (GitHub icon + repo count, or folder icon + local label/path). - Popover has a 2-tab segmented control at the top; the Local tab is hidden entirely on web (local_directory needs a daemon_id). - Local tab requires the daemon online — amber notice + disabled picker when offline, re-renders automatically via useLocalDaemonStatus. - Switching tabs preserves the other side's stash, but handleSubmit only emits the resource matching the active sourceMode, so abandoned picks never leak into the created project. Backend mutual-exclusion validation + the resources-section conditional-add-button still to come — this PR just unblocks the dialog so it can be demoed. * fix(mobile): cover waiting_local_directory in run row status maps (MUL-2618) --------- Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: Multica J <j@multica.ai>	2026-05-27 13:44:31 +08:00
Bohan Jiang	13f74e651a	feat(agents): remove custom_env from agent resources, add audited env endpoint (MUL-2600) (#3209 ) * feat(agents): remove custom_env from agent resources, add audited env endpoint (MUL-2600) The agent resource shape (list / get / create / update / archive / restore responses + WebSocket events) no longer carries `custom_env` values. Reads/writes of env now flow exclusively through a dedicated `/api/agents/{id}/env` endpoint that is owner/admin-only, rejects agent-actor sessions, applies a "**" sentinel preserve guard on PUT, and writes a persistent audit row per reveal/update. Why - `multica agent list --output json` historically returned plaintext `custom_env` for owner/admin callers (the redaction gate gave only members the masked map). Any agent token running on the workspace inherits its owner's role and could read every other agent's secrets just by listing. - Patching list/get redaction alone (PR #3175 direction) left symmetric leaks via mutation responses, WS events, the "reveal" path itself (no actor-aware auth), and a `` overwrite footgun on UpdateAgent. What changed - Backend: drop `custom_env` from AgentResponse; add coarse `has_custom_env` + `custom_env_key_count`. Strip env handling from UpdateAgent (silently ignored if sent). Keep CreateAgent's custom_env acceptance. - Backend: new GET/PUT `/api/agents/{id}/env` handlers in `internal/handler/agent_env.go`: - resolveActor → 403 for agent actors (closes the lateral-movement path). - Owner/admin role gate via existing helper. - PUT honours value == "*" as "preserve existing value". - Both write to `activity_log` with `agent_env_revealed` / `agent_env_updated` actions. Audit details record key names only, never values. - Daemon claim path (`ClaimAgentTask`) unchanged — `TaskAgentData` still carries plaintext env for runtime injection. - SQL: new `UpdateAgentCustomEnv` query; sqlc regenerated (v1.31.1). - CLI: new `multica agent env get\|set` subcommands. `--custom-env` flags removed from `multica agent update`; the no-fields error now points to the new path. - Frontend: drop env fields from `Agent` + `UpdateAgentRequest`; add `getAgentEnv` / `updateAgentEnv` client methods; rewrite env-tab to show "N variables configured" + explicit "Reveal & edit" button, fetching values only on intentional reveal. - Locales: parity-safe additions to en + zh-Hans. - Docs: agents-create.{mdx,zh.mdx} reflect the new threat model and endpoint. - Mobile: schema drops `custom_env` / `custom_env_redacted`, adds metadata fields. Tests - Handler tests pinned the new invariants: no env in list/get responses, owner reveal happy-path + audit row, agent-actor 403, `***` sentinel preserves real values, UpdateAgent silently ignores `custom_env`, pure `mergeAgentEnv` cases. - CLI tests pivot to the new flag surface: `agent update` MUST NOT expose the env flags; `agent env set` MUST expose --custom-env-stdin/--custom-env-file. - Frontend test fixtures updated; pnpm typecheck / test / lint pass cleanly. This is a breaking API change. Scripts that read `custom_env` from `/api/agents` must migrate to `GET /api/agents/{id}/env`. Co-authored-by: multica-agent <github@multica.ai> fix(agents): close actor-spoofing + audit fail-closed in env endpoints (MUL-2600) Addresses Elon's review of #3209: * Mint a task-scoped `mat_` token per claim, bound to (agent, task, workspace, owner). Daemon injects it into the agent process in place of its own credential. Auth middleware authoritatively rebuilds X-User-ID / X-Agent-ID / X-Task-ID from the token row and sets X-Actor-Source=task_token; that header is server-set only — incoming values are stripped before any auth branch runs. resolveActor honors the header so an agent that strips X-Agent-ID / X-Task-ID still resolves as actor=agent. * GetAgentEnv / UpdateAgentEnv are now fail-closed on audit-log failures: GET refuses to return plaintext, PUT persists inside the same tx as the audit row so they commit/roll back together. * PUT /api/agents/{id} returns 400 when the body carries custom_env instead of silently dropping it — directs callers to the audited env endpoint. * Agent actors never see mcp_config, even when the underlying member is owner/admin; mutation broadcasts go through a redaction shim so WS subscribers don't pick it up either. * Fix backend test that asserted dense JSON (jsonb::text renders whitespace) and frontend test that assumed a unique "Test User" match. Co-authored-by: multica-agent <github@multica.ai> * fix(agents): close residual MUL-2600 gaps from review (MUL-2600) Migration 108 FK now correctly references agent_task_queue(id) instead of the non-existent agent_task table; the previous name blocked CI backend migrations. Task-token-authenticated requests can no longer be re-routed at a different workspace by passing workspace_slug / workspace_id / ?workspace_id / a URL workspace param. ResolveWorkspaceIDFromRequest and resolveWorkspaceUUID both short-circuit on X-Actor-Source=task_token and return only the token-bound X-Workspace-ID; buildMiddleware adds a defence-in-depth 403 if any URL-resolved workspace disagrees with the token binding. mcp_config no longer leaks back to agent actors through UpdateAgent / CreateAgent / ArchiveAgent / RestoreAgent HTTP responses — the same redactAgentResponseForActor helper that GetAgent/ListAgents use is now applied to mutation responses too. WS broadcasts were already redacted via broadcastAgentResponse. FailTask and every TaskService cancel path (CancelTask / CancelTasksForIssue / CancelTasksForAgent / CancelTasksByTriggerComment / BroadcastCancelledTasks) now eagerly DeleteTaskTokensByTask so the mat_ token's 24h window doesn't outlive a terminated task. Failure is non-fatal — the FK cascade and expiry remain durable guards. Doc-only: clarify that PUT /api/agents/{id} now hard-rejects bodies that carry custom_env (was previously "silently ignores"). Tests: - middleware: TestResolveWorkspaceIDFromRequest gains a task_token case asserting client-supplied slug/id/query cannot override the bound workspace. - handler: TestUpdateAgent_RedactsMcpConfigForAgentActor and TestUpdateAgent_KeepsMcpConfigForMemberActor pin the mutation- response redaction contract per actor type. Co-authored-by: multica-agent <github@multica.ai> * fix(agents): match redacted mcp_config as JSON null, not Go nil (MUL-2600) `AgentResponse.McpConfig` is `json.RawMessage` without `omitempty`, so the redacted response serialises as `"mcp_config": null`. On decode, `json.RawMessage` keeps the literal bytes `null` rather than collapsing to Go nil, which made the assertion fire on a non-leak. The product contract (field always present, distinguished from "no config" via `mcp_config_redacted`) is intentional, so adjust the test to check for "no secret-bearing content" instead of weakening the contract via `omitempty`. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-25 18:42:48 +08:00
Tom Qiao	1c91c2a3b2	security(db): scope DELETE/UpdateIssueStatus by workspace_id (defense-in-depth) (#3027 ) * fix(security): scope DELETE/UpdateIssueStatus by workspace_id Add workspace_id to the WHERE clause of DeleteIssue, DeleteComment, DeleteProject, DeleteSkill, DeleteChatSession, and UpdateIssueStatus as SQL-layer defense-in-depth. Handler loaders (loadIssueForUser / loadSkillForUser / etc.) already enforce workspace membership today, so this is not patching a known live vuln. But the tenant invariant is currently a handler-layer guarantee — a future loader bypass or a new caller skipping the loader would be silently catastrophic. Making workspace_id part of the SQL identity collapses the trust surface to the schema itself: forging a sibling-workspace UUID becomes ErrNoRows instead of a cross-tenant write. Reference: incident #1661 (util.ParseUUID silent zero UUID returning 204 on a DELETE that matched zero rows) — same class of failure, prevented at a different layer. Scope: - 5 DELETE queries: issue, comment, project, skill, chat_session - 1 simple UPDATE: UpdateIssueStatus (2 narg, no SET ordering risk) - All callers updated (handlers, service, runtime sweeper fallback) Multi-narg UPDATE queries (UpdateIssue, UpdateProject, UpdateSkill, UpdateComment, UpdateChatSession) are deferred to a follow-up to keep this change reviewable: each needs its narg pinning shifted and per-caller verification. sqlc was regenerated by hand (no local sqlc toolchain); CI's backend job is the authoritative compile check. test(security): add workspace_scope_guard regression test Locks in the SQL-layer tenant guard added in this PR. For each of the 6 scoped queries (DeleteIssue, DeleteComment, DeleteProject, DeleteSkill, DeleteChatSession, UpdateIssueStatus), creates the resource in workspace A, invokes the query with a foreign workspace UUID, and asserts the row is untouched (0 rows affected with no error for :exec; pgx.ErrNoRows for :one). A future refactor that drops the workspace_id arg from any of these queries will now fail loudly instead of silently regressing. Includes a sanity sub-test that the in-workspace path still mutates, so a buggy guard that returns no-op for every call would not pass. Co-Authored-By: Claude Opus 4 <noreply@anthropic.com> --------- Co-authored-by: Tom Qiao <tomqiaozc@users.noreply.github.com> Co-authored-by: Claude Opus 4 <noreply@anthropic.com>	2026-05-22 14:39:47 +08:00
Naiyuan Qing	fedd0f1694	feat(issues): live agent activity chip + per-issue indicator + filter (#3058 ) * feat(server): broadcast task:running event The dispatched → running transition was silent: only task:queued, task:dispatch, task:cancelled, task:completed and task:failed broadcast over WS. Any UI that distinguishes "queued" from "running" (e.g. the new issue-card agent activity indicator) would lag by up to the 30s agentTaskSnapshot staleTime on the most user-visible transition. StartTask now broadcasts task:running so the workspace snapshot invalidates immediately, keeping the agent activity UI live. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(issues): live agent activity chip + per-issue indicator + filter Surfaces "which agents are working on what, right now" in the Issues and My Issues views, with a one-click filter to narrow the list to issues that have a running agent task. Two visual surfaces: - Workspace chip in the header (left of Filter). Shows the brand-tinted avatar stack of agents currently running on visible issues. Click toggles a page-scoped filter; idle state renders a static "0 working" button with a hover-card placeholder. When the filter is active the chip pins to brand fill across hover and popover states (the Button outline variant otherwise repaints back to neutral). A muted "Viewing only working agents" hint sits to the left of the chip whenever the filter is on, so users notice the active state without having to hover. - Per-issue indicator on every board card and list row (top-right of the identifier line). Renders the avatar stack of agents in running or queued state on that issue, full-opacity ring at brand/70 when ≥1 is running, half-opacity stack when only queued. Returns null when nothing is in flight. Both surfaces open the same hover-card body that lists each active task with the agent avatar, status dot (composed via the existing availability + workload tokens), and a live-ticking duration. Adds a new "All" scope to /my-issues that unions assignee, creator, and involves_user_id via three parallel fetches deduped on the client — no backend changes for this part. The chip's count and the quick-filter both use the page's currently visible issue ids so they stay in sync with the active scope. State is per-user (Zustand + localStorage) and the agentRunningFilter is intentionally omitted from partialize — running state changes second-to-second and a stored toggle would land users in an unexplained empty list. WS task:running, already added in the preceding commit, drives real-time updates without polling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(issues): swap indicator ring pulse for shimmer text label Earlier iterations layered a brand ring with various opacity-pulse cadences around the per-issue avatar stack. Every tuning attempt was either invisible (transparent ring + faded pulse) or oppressive (a visible ring that flashed on a dense board). Moves the "alive" signal onto a small text label and reuses chat's existing `animate-chat-text-shimmer` utility — a soft light sweep across the glyphs that already powers the ChatGPT-style "thinking" cue in task-status-pill. Indicator now reads as a 12 px avatar stack + 10 px label: - Running → full-opacity avatars + shimmering localized "Working" - Queued → half-opacity avatars + muted static "Queued" - Idle → render nothing (unchanged) Avatars and the surrounding card stay completely still; only the few glyphs animate. The label is i18n-driven via the existing `status_running` / `status_queued` keys, so no locale changes are required. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 14:20:42 +08:00
YOMXXX	29c2a5d18f	fix(daemon): reclaim stale dispatched claims (MUL-2485) (#2872 ) * fix(daemon): reclaim stale dispatched claims * fix(daemon): widen stale claim reclaim window	2026-05-21 17:06:55 +08:00
iYuan	2f1f90c11a	fix(agent): retry codex semantic inactivity fresh (#2593 )	2026-05-20 20:03:39 +08:00
Angular	1f978bf1ec	feat(autopilot): link created issues to projects (#2908 ) * feat(autopilot): link created issues to projects * test(autopilot): cover project flag	2026-05-20 15:37:23 +08:00
Bohan Jiang	b7082a01f1	fix(issues): retry button targets the row's agent (MUL-2457) (#2921 ) * fix(issues): retry button targets the row's agent, not the assignee (MUL-2457) The execution log retry button used to re-fire the issue's current assignee instead of the agent that actually ran the clicked row. After a reassignment, or for squad workers / @-mention agents, the rerun landed on the wrong agent. POST /api/issues/{id}/rerun now accepts an optional task_id: when set, the rerun targets that task's agent (and reuses its leader/worker role). An empty body keeps the assignee-driven CLI/API contract. The execution-log retry button passes task.id, so per-row retry always fires the correct agent. enqueueMentionTask gained a forceFreshSession parameter so the new mention-path rerun keeps the same fresh-session contract as the assignee path. Co-authored-by: multica-agent <github@multica.ai> * fix(issues): inherit trigger provenance + fix cross-issue test (MUL-2457) Address review feedback on PR #2921: 1. RerunIssue now inherits TriggerCommentID from the source task when sourceTaskID is valid. Without this, a per-row rerun of a comment- or mention-triggered task degrades into a generic issue run because the daemon's buildCommentPrompt path keys on TriggerCommentID. The inherited summary is rebuilt naturally inside the enqueue helpers (buildCommentTriggerSummary derives it from the comment ID). 2. The new cross-issue rejection test inserted a second issue without `number`, hitting uq_issue_workspace_number on a same-workspace collision with the fixture's issue. Both inserts now claim the next available per-workspace number (MAX(number)+1) — matching the pattern used by notification_listeners_test. Added TestRerunIssueInheritsTriggerCommentFromSourceTask to lock the trigger provenance contract. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 15:30:03 +08:00
Jiayuan Zhang	fc8528d64d	feat(autopilot): support assigning to a squad (MUL-2429) (#2888 ) * feat(autopilot): support assigning autopilot to a squad (MUL-2429) Path A (Squad-as-Leader) from the RFC: when an autopilot's assignee is a squad, dispatch resolves to squad.leader_id and executes against the leader's runtime — semantics match a human manually assigning the issue to that squad, no fan-out. Backend scope only; frontend picker change is a follow-up PR. Changes: - 096_autopilot_squad_assignee migration: drop agent FK on autopilot.assignee_id, add assignee_type column (default 'agent'), add autopilot_run.squad_id attribution column. - service.AgentReadiness: single source of truth for archived / runtime-bound / runtime-online checks. Shared by autopilot admission gate, run_only dispatch, and isSquadLeaderReady. - service.resolveAutopilotLeader: translates assignee_type/id to the agent that actually runs the work. - dispatchCreateIssue: stamps issue with assignee_type='squad' for squad autopilots and enqueues via EnqueueTaskForSquadLeader. - dispatchRunOnly: belt-and-braces readiness re-check after resolving squad → leader so a leader that went offline between admission and dispatch produces a clean failure instead of a doomed task. - handler.CreateAutopilot / UpdateAutopilot: accept assignee_type with squad/agent existence + leader-archived validation. Backward-compatible default of "agent" preserves the contract for older clients. - Analytics: AutopilotRunStarted/Completed/Failed events carry assignee_type and squad_id; PostHog can now group autopilot runs by squad without joining back to the autopilot row. Co-authored-by: multica-agent <github@multica.ai> * fix(autopilot): reject archived squads, route post-admission skips, cleanup dangling-agent autopilots (MUL-2429) Addresses three review findings on PR #2888: 1. Archived squad handling: validateAutopilotAssignee now rejects squads with archived_at set; resolveAutopilotLeader returns errSquadArchived so the admission gate fails closed; DeleteSquad now mirrors the issue transfer for autopilot rows (TransferSquadAutopilotsToLeader) so surviving autopilots flip to assignee_type='agent' (leader) instead of dangling at the archived squad. 2. dispatchRunOnly post-admission readiness: introduces errDispatchSkipped sentinel, recognised by DispatchAutopilot via handleDispatchSkip so the run is recorded as `skipped` (not `failed`). Manual triggers no longer 500 when the leader's runtime goes offline between admission and task creation. New TestManualTriggerDoesNotErrorOnPostAdmissionSkip locks the behaviour in. 3. Dangling agent assignee after migration 096 dropped the FK: shouldSkipDispatch now distinguishes pgx.ErrNoRows / errSquadArchived (hard skip — retrying won't help) from transient DB errors (fail-open). DeleteAgentRuntime pauses autopilots that target agents about to be hard-deleted (ListArchivedAgentIDsByRuntime + PauseAutopilotsByAgentAssignees) so the breakage surfaces as a paused row in the UI instead of a quiet skip-burning loop. Unit tests cover the sentinel unwrap contract and errSquadArchived errors.Is behaviour. Integration test TestAutopilotDispatchSkipsWhenRuntimeOffline re-verified against a fresh DB with migration 096 applied. Co-authored-by: multica-agent <github@multica.ai> * fix(autopilot): bump last_run_at on post-admission skip (MUL-2429) Match recordSkippedRun (pre-flight skip) and the success path so the scheduler / "last seen" UI both reflect that this tick evaluated the trigger, even when the post-admission readiness gate caught a late regression. Addresses Emacs review caveat #1 on PR #2888. Co-authored-by: multica-agent <github@multica.ai> * feat(autopilot): mixed agent/squad assignee picker in dialog (MUL-2429) End-to-end UI for assigning an autopilot to a squad. Closes the PR #2888 backend gap: the squad-as-assignee feature was already wired in Go (Path A, RFC §4) but the desktop dialog never offered the choice. - core/types/autopilot: add `AutopilotAssigneeType`, surface `assignee_type` on `Autopilot` + Create/Update request payloads. - views/autopilots/pickers/agent-picker: switch to a polymorphic AssigneeSelection (`{type, id}`); render agents and squads as two grouped sections with shared pinyin search. - views/autopilots/autopilot-dialog: maintain `assigneeType` state, send it on create/update, render the trigger avatar / hover dot with `assignee.type`. - views/autopilots/autopilots-page + autopilot-detail-page: render the assignee row using `autopilot.assignee_type` so squad-typed autopilots show the squad avatar + name, not a broken agent lookup. - locales: add `agents_group` / `squads_group` / `select_assignee` keys (en + zh-Hans), keep legacy `select_agent` for callers that still reference it. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Lambda <lambda@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 05:30:13 +02:00
Bohan Jiang	9a577f3e11	fix(runtimes): anchor OpenCode skill + AGENTS.md discovery to task workdir (MUL-2416) (#2849 ) * fix(runtimes): anchor OpenCode skill + AGENTS.md discovery to task workdir OpenCode resolves its project discovery root from `--dir` and `PWD` before falling back to `process.cwd()`. The daemon set `cmd.Dir = workDir` but never overrode the inherited `PWD`, so OpenCode walked from the daemon's shell directory and silently bypassed the per-task workdir — agents lost visibility into `.opencode/skills/` and `AGENTS.md`, falling back to whatever global skills the host had installed (MUL-2416). - Pass `opencode run --dir <workDir>` and override `PWD=<workDir>` in the child env so AGENTS.md walk-up + `.opencode/skills` project config scan both anchor on the task workdir. - Block `--dir` from custom args so user overrides cannot re-introduce the regression. - Plumb skill `description` from DB through service / daemon / execenv. `writeSkillFiles` synthesizes a YAML frontmatter block (`name`, optional `description`) when the stored content lacks one, since runtimes like OpenCode silently drop SKILL.md files without a parseable `name`. Existing frontmatter is preserved unchanged so upstream-imported skills (GitHub / ClawHub / Skills.sh) keep their hand-shaped metadata. Tests: - New fake-CLI test confirms argv carries `--dir <workDir>` and the child sees `PWD=<workDir>`. - New test confirms a user-supplied `--dir` in custom_args is dropped. - New execenv tests cover synthesized frontmatter and preservation of pre-existing frontmatter. Co-authored-by: multica-agent <github@multica.ai> * fix(runtimes): inject SKILL.md `name` when upstream frontmatter omits it Skills imported with frontmatter that sets `description` but leaves `name` implicit (relying on the directory slug, as common in GitHub/Skills.sh imports) still hit OpenCode's "no parseable name → drop" path because the DB Name fallback never made it into the SKILL.md body. ensureSkillFrontmatter now scans the existing block and, when name is missing or empty, prepends `name: <slug>` while preserving description, body, and any runtime-specific keys verbatim. Also tighten yamlEscapeInline to always double-quote so descriptions that look like YAML keywords (`null`, `true`, `[foo]`, `{x: y}`, `2024-01-01`) parse as strings rather than getting reinterpreted and rejected. Adds regression test for the nameless-frontmatter case and updates the existing OpenCode skill test for the always-quoted description format. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-19 16:21:02 +08:00
Bohan Jiang	eabfb8f3d1	fix(autopilots): reject unknown {{...}} tokens in issue title template (MUL-2370) (#2799 ) * fix(autopilots): reject unknown {{...}} tokens in issue title template (MUL-2370) `--issue-title-template` (and the matching `issue_title_template` API field) silently kept any placeholder other than `{{date}}` as a literal string in the rendered issue title — `{{.TriggeredAt}}`, `{{trigger_id}}`, `${date}`, etc. would all slip through `strings.ReplaceAll` unchanged because the renderer only knew one token. The flag name and help text ("Template for issue titles (create_issue mode)") and the docs phrasing ("the title supports interpolation like `{{date}}`") both implied a richer placeholder set existed. Tightens the contract on three fronts: - Reject any `{{...}}` token other than `{{date}}` at create/update time with `unknown template variable %q; supported: {{date}}` — turns the silent-on-trigger surprise into an explicit 400 the moment the user sets the template. - Update CLI flag help on `autopilot create --issue-title-template` and `autopilot update --issue-title-template` to spell out that only `{{date}}` (UTC, YYYY-MM-DD) is interpolated. - Update `apps/docs/content/docs/autopilots{,.zh}.mdx` to drop the "like `{{date}}`" phrasing for the single supported placeholder. Adds service-layer tests covering `interpolateTemplate` (substitution, empty-template fallback, no-placeholder verbatim) and `ValidateIssueTitleTemplate` (accepts empty / plain / `{{date}}` / `{{ date }}`; rejects Go-template, Mustache-style, future placeholders like `{{datetime}}`, and templates that mix one valid and one invalid token). Expanding the placeholder set (`{{datetime}}`, `{{trigger_id}}`, `{{trigger_source}}`) is tracked as a separate enhancement — those need run/trigger context plumbed into the renderer, which is out of scope for this bug fix. Closes #2732 Co-authored-by: multica-agent <github@multica.ai> * fix(autopilots): render {{ date }} whitespace form too (MUL-2370) Validator permitted {{ date }} but interpolateTemplate only matched the exact string {{date}}, so a template that passed create/update could still emit a literal {{ date }} at trigger time — re-introducing the silent-literal behaviour the validator was meant to remove. Route rendering through the same regex as validation so every accepted form is also a substituted form. Cover {{ date }} substitution in TestInterpolateTemplate. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-18 18:12:14 +08:00
Multica Eve	dfe2a57361	fix(autopilots): allow duplicate create_issue runs (#2789 ) Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-18 16:05:54 +08:00
Kerim Incedayi	9418d2a2c1	feat(autopilots): webhook triggers (server + CLI + UI + docs) MUL-2049 (#2348 ) * feat(server): add webhook trigger DB migration + sqlc queries Lays the foundation for webhook autopilot triggers: - partial unique index on autopilot_trigger.webhook_token (kind=webhook only) so the public ingress route can resolve a trigger in O(1) - GetWebhookTriggerByToken / TouchAutopilotTriggerFiredAt / RotateAutopilotTriggerWebhookToken / SetAutopilotTriggerWebhookToken queries, regenerated with sqlc * feat(server): webhook token generator + payload normalizer Two pure helpers for the webhook autopilot work: - generateWebhookToken: 32 random bytes -> base64-url, "awt_" prefix. 256 bits of entropy keeps brute-force off the table; the prefix makes leaked tokens recognisable in logs. - normalizeWebhookPayload: turns arbitrary JSON into the WebhookEnvelope shape (event/eventPayload/request) used by trigger_payload. Header- and body-based event inference covers GitHub, GitLab, X-Event-Type, and caller-provided envelopes; scalar/empty/invalid bodies are rejected so the handler can answer 400. * feat(server): generate webhook tokens and expose rotate endpoint - New handler.Config.PublicURL fed by MULTICA_PUBLIC_URL env so /api/autopilots/.../triggers responses can include an absolute webhook_url alongside the always-present webhook_path. - CreateAutopilotTrigger now mints a webhook_token via crypto/rand for kind=webhook and ignores cron/timezone for non-schedule kinds. api triggers stay accepted-but-inert per PLAN.md. - New POST /api/autopilots/{id}/triggers/{triggerId}/rotate-webhook-token protected by the existing workspace auth group; old tokens stop working immediately because the unique-index lookup keys on the current row value. * feat(server): public webhook ingress route + per-token rate limiter - New POST /api/webhooks/autopilots/{token} route, mounted outside the authenticated group: the path token is the credential. Workspace context is derived from the joined autopilot row, never headers. - Body capped at 256 KiB via http.MaxBytesReader; oversized payloads return 413 mid-read instead of being fully buffered. - Disabled triggers / paused / archived autopilots return 200 {"status":"ignored"} so providers stop retrying. - Skipped-runtime dispatches surface 200 {"status":"skipped"} with the reason from the autopilot service's pre-flight admission check. - WebhookRateLimiter interface with sliding-window in-memory + Redis Lua-script implementations. Default 60 req/min per token. Test coverage on the in-memory path; Redis variant fails open on cache errors so a Redis hiccup never blocks ingress. - Integration tests exercise token generation, dispatch, payload envelope persistence, GitHub-header inference, paused/disabled short-circuits, oversized rejection, and rotate-then-old-token-404. * feat(server): include webhook payload in create_issue description When an autopilot run is triggered by a webhook and execution_mode is create_issue, the agent only sees the issue body — never the run's trigger_payload. Append a 'Webhook event:' line and a fenced JSON block with the normalized eventPayload so the agent has the inbound context inline. Schedule / manual runs are unchanged. Tests cover: - schedule path keeps existing italic note, no webhook block - webhook path emits event line + payload block, italic before block - non-envelope JSON falls back to raw body (defensive) - non-webhook source with payload still gets no webhook block * feat(core): types, API client and mutations for webhook triggers - AutopilotRunStatus gains 'skipped' so the run-list UI handles the admission-skipped state explicitly instead of falling through to a generic case (the backend already emits it via MUL-1899). - AutopilotTrigger picks up optional webhook_path / webhook_url. Both are optional so older self-hosted servers that pre-date this change still parse cleanly. - buildAutopilotWebhookUrl helper composes a usable absolute URL with the priority webhook_url > apiBaseUrl + path > origin + path > path. Tested with seven cases covering each branch. - ApiClient.rotateAutopilotTriggerWebhookToken posts to /api/autopilots/{id}/triggers/{triggerId}/rotate-webhook-token; the HTTP-contract test pins URL + method. - useRotateAutopilotTriggerWebhookToken mutation invalidates autopilotKeys.detail on settle, mirroring the existing trigger-mutation pattern. * feat(views): webhook trigger UI in Add Trigger dialog and trigger row Add Trigger dialog gains a Schedule/Webhook segmented toggle: - Schedule reuses TriggerConfigSection unchanged. - Webhook hides the cron config and shows a help line; the trigger is created with kind=webhook and the URL is generated server-side. - Toast text differentiates schedule vs webhook on success. TriggerRow grows a webhook branch: - Webhook icon, kind translated via trigger_kind. - URL shown in a truncating monospace pill, with copy + rotate buttons. Copy uses navigator.clipboard with toast feedback; rotate uses an AlertDialog confirm because the old URL stops working immediately. - api triggers render a Deprecated badge and skip URL/copy/rotate affordances. RunRow gains a 'skipped' RUN_VISUAL entry (muted dash) so admission- skipped runs don't fall through to a generic case. Source label uses the new run_source i18n key instead of capitalize. Locales: en + zh-Hans gain run_status.skipped, run_source., trigger_kind., trigger_row.{copy_url,rotate_url,_confirm_,toast_}, add_trigger_dialog.{type_,webhook_help,toast_added_{schedule,webhook}}. * feat(cli): support webhook trigger creation and URL rotation - multica autopilot trigger-add now takes --kind schedule\|webhook (default schedule for backward compatibility). For webhook it skips --cron / --timezone validation and prints the resulting webhook URL, preferring the server-provided webhook_url and falling back to client.BaseURL + webhook_path. - New multica autopilot trigger-rotate-url <autopilot-id> <trigger-id> command for rotating the bearer URL of a webhook trigger. * docs(autopilots): add webhook trigger guide (en + zh) Replaces the 'Webhook and API triggers are not available yet' section with end-to-end webhook documentation: how the URL is generated, what payload shapes are accepted, the inferred-event rules, the bearer-secret warning + rotate flow, status-code semantics for accepted/skipped/ ignored/4xx/5xx outcomes, and the MULTICA_PUBLIC_URL self-host configuration. Run history list now mentions skipped status. The 'unavailable features' section narrows to api-kind triggers, HMAC signing, IP allowlists, and provider presets. * feat(views): add Schedule/Webhook toggle to the create autopilot dialog Closes the gap where a brand-new autopilot could only be created with a schedule trigger. The right-column config now has a Trigger section with a segmented Schedule/Webhook control: - Schedule keeps the existing cron/timezone UI. - Webhook hides the cron UI and shows a help line; on submit, a kind=webhook trigger is created right after the autopilot. In edit mode the toggle is intentionally hidden (PLAN.md treats trigger- type changes as delete-old + create-new, not in-place updates), but the panel still picks the right kind based on props.triggers[0].kind so a webhook autopilot doesn't render an irrelevant cron form. Locales: section_trigger_kind, trigger_kind_{schedule,webhook}, section_webhook, webhook_help_{create,edit} added in en + zh-Hans. * feat(views): show webhook URL inline after creating a webhook autopilot After a successful create with kind=webhook, the dialog stays open and swaps to a confirmation panel showing the freshly minted URL with a copy button + 'Treat this URL like a password' warning + Done button. Avoids the friction of "create the autopilot, then go find it in the list, click in, scroll to triggers, copy URL." Locales: dialog.webhook_created_{title,description,warning,done} added in en + zh-Hans. Schedule create flow is unchanged (toast + close). The success panel is gated on the trigger returned from the create mutation, so a partial failure (autopilot created, trigger creation errored) still falls through to the toast_create_partial path. * feat(views): show webhook payload in run detail dialog The agent transcript dialog now accepts an optional headerSlot that sits above the event list. The autopilot RunRow drops a WebhookPayloadPreview into that slot when the run came from a webhook and trigger_payload is non-empty. The preview is collapsed by default (the transcript itself is the main event), shows the inferred event name + receivedAt in the header, and reveals the eventPayload as pretty-printed JSON with a copy button on expand. Falls back gracefully if the row's trigger_payload doesn't match the WebhookEnvelope shape — the whole value is shown instead so nothing is hidden. Closes the "agent didn't echo the payload, now I can't see what triggered the run" gap. PLAN.md tracked this as "Payload preview in run history" under follow-ups. Locales: webhook_payload.{label, unknown_event, payload, content_type, copy, copied, copied_short, copy_failed} added in en + zh-Hans. * chore(server): wire MULTICA_PUBLIC_URL through self-host compose Two small follow-ups split out of the webhook trigger PR: - docker-compose.selfhost.yml passes MULTICA_PUBLIC_URL into the backend container so a self-hosted deployment behind a real domain gets absolute webhook URLs in the trigger response. Documented in .env.example with the rationale for not deriving the public host from request headers. - Drop a duplicated 'invalid json:' prefix in the webhook ingress 400 error path. normalizeWebhookPayload already prefixes its errors, so the handler doesn't need to re-prefix. * fix(migrations): renumber webhook trigger migration 081 → 089 to avoid collision The branch's 081_autopilot_webhook_triggers.{up,down}.sql collided numerically with 081_runtime_timezone.{up,down}.sql that landed on main, making migration apply order undefined. Renumber to 089 so the file slots after the latest main migration (088_squad_instructions). The SQL itself doesn't conflict — it only creates a partial unique index on autopilot_trigger.webhook_token — but the duplicate prefix is what the migration runner sees, so the filename must move. * fix(autopilot-webhook): address PR review blocking issues - Redact bearer tokens from request logs: paths matching /api/webhooks/autopilots/<token> now log "[redacted]" instead of the token. The resolved trigger ID is plumbed via context so audit lines stay useful for debugging. (Review item Blocking #1.) - Distinguish pgx.ErrNoRows from transient DB errors in token lookup: no-row stays 404 (so providers don't retry on a deleted webhook), other errors return 500 (which providers DO retry, avoiding silent drops on DB blips). (Review item Blocking #2.) - Add per-IP sliding-window rate limiter that runs BEFORE the token lookup, so spraying random tokens can no longer probe the autopilot_trigger index unboundedly. Reuses the existing Lua script with a separate Redis key namespace; falls open on Redis errors. Default budget 30 req/min/IP. (Review item Blocking #3.) The webhook handler now applies the gates in the order: per-IP rate limit → token lookup → per-token rate limit → handler logic. * fix(autopilot): atomic webhook trigger creation + strict kind/timezone validation - Mint the webhook bearer token BEFORE the INSERT and pass it via CreateAutopilotTriggerParams so the row never exists in a half-written kind=webhook + webhook_token=NULL state. On the (vanishingly rare) unique-index collision the whole INSERT is retried with a fresh token — no UPDATE second step. Removes the now-dead attachFreshWebhookToken helper. (Review item Recommended #4.) - Add new GET /api/autopilots/{id}/runs/{runId} endpoint that returns a single run including the full trigger_payload. The list response is now slim (omits trigger_payload) so worst-case payload size drops from ~5 MB to ~5 KB. (Review item Recommended #5, server side.) - Reject kind=api with 400 ("kind=api is deprecated; use schedule or webhook") and reject kind=webhook with --timezone with 400 — both surfaces stragglers loudly instead of silently dropping fields. CLI mirrors the check so --timezone with --kind webhook errors client-side. (Review nits.) - Add --yes (-y) flag and an interactive y/N confirmation prompt to `multica autopilot trigger-rotate-url` so the destructive rotate matches the UI's AlertDialog safety. (Review item Recommended #6.) * fix(views): fetch webhook payload on-demand and truncate at 4 KiB - Add useAutopilotRun query hook + getAutopilotRun API client method paired with the new server endpoint. The run-detail dialog now mounts a WebhookPayloadSlot that fetches the full run (incl. trigger_payload) lazily — list responses no longer carry up to 256 KiB × N runs of envelope data. - WebhookPayloadPreview truncates its in-DOM <pre> at 4 KiB with a localized marker so jank-y machines aren't asked to render a 256 KiB JSON blob. The Copy button still yields the full string. - Adds the truncated_marker i18n string to en + zh-Hans. Review items Recommended #5 (frontend) and a nit on the preview's unbounded <pre>. * test(autopilot-webhook): close coverage gaps flagged in PR review - request_logger: redactWebhookPath unit tests + integration test proving the bearer token never lands in slog output, plus the webhook_trigger_id context plumbing. - autopilot_webhook_handler: empty body → 400, archived autopilot → 200 ignored, per-IP rate limiter trips before DB lookup, kind=api and webhook+timezone are rejected at 400, slim list + full detail endpoint round-trip. - webhook_rate_limiter: Lua script structure guard (catches reordering even without a live Redis), plus live-Redis tests for both per-token and per-IP limiters (REDIS_TEST_URL gated, matching the existing Redis test pattern in the package). - WebhookPayloadPreview: envelope rendering, fallback shape, and the >4 KiB truncation path with full-payload-on-Copy guarantee. Two branches are documented as code-review-protected rather than covered by tests: the 500-on-DB-error path requires injecting a stub Queries (no interface here), and the cross-workspace defense-in-depth check is unreachable from valid SQL state. * fix(middleware): SetWebhookTriggerID must mutate request in place The round-1 helper returned a fresh http.Request from WithContext, and the webhook handler did `r = SetWebhookTriggerID(r, ...)`. That swaps the handler's local pointer but doesn't propagate the new context back to RequestLogger, which is still holding the original http.Request — so the audit line never actually included webhook_trigger_id in production. The round-1 test happened to pass because it pre-stashed the value on the request before calling ServeHTTP, bypassing the bug it was meant to verify. Switch to in-place mutation via `r = r.WithContext(...)` so the wrapping middleware sees the new context after next.ServeHTTP returns, and update the test to exercise the real call pattern (set the context from inside the handler, assert the surrounding logger reads it). Verified live: an accepted webhook now logs path=/api/webhooks/autopilots/[redacted] webhook_trigger_id=<uuid> * fix(autopilot-webhook): symmetric ErrNoRows split + trusted-proxy gate Round-2 review (Bohan-J, PR #2348 follow-up): - Must-fix #1: the second lookup at autopilot_webhook.go:258 (GetAutopilot after the token resolves) was folding every error into 404. A transient DB blip would tell a webhook sender "not found" and it would never retry. Apply the same errors.Is(err, pgx.ErrNoRows) → 404 / else → 500 split as the first lookup got in round 1. - Must-fix #2: clientIPForRateLimit was honoring X-Forwarded-For / X-Real-IP from any caller. An attacker spraying random tokens could just rotate the XFF header and the per-IP bucket became per-request, so the limiter that's specifically supposed to gate spraying before it hits the DB unique index was bypassed. New shape — matches Bohan's suggestion exactly: * Default: r.RemoteAddr only, headers ignored. * Operator opt-in via MULTICA_TRUSTED_PROXIES (comma-separated CIDRs). XFF/X-Real-IP are honored only when r.RemoteAddr is inside one of the listed prefixes; otherwise they're dropped. Wired through .env.example and docker-compose.selfhost.yml so self-host operators can configure their reverse-proxy's CIDR. Invalid CIDRs in the env var are dropped with a single slog.Warn at startup rather than crashing the server. Uses net/netip (stdlib, value-typed) for parsing and containment checks. Verified live on the rebuilt self-host backend: a 35-request spray from one source with rotating XFF gets the expected 30× 404 + 5× 429, proving the per-IP bucket is keyed on the real connection IP. * fix(autopilot): reject cron/timezone PATCH on non-schedule triggers Round-2 review should-fix. CreateAutopilotTrigger already 400s on kind=webhook + timezone/cron_expression, but UpdateAutopilotTrigger silently wrote those fields regardless of prev.Kind. The values then sat in the DB visible to nobody and read by nothing — a back door that left the API contract fuzzy across create vs update. Mirror the create-path discipline: after loading prev, if prev.Kind != "schedule" and the PATCH body sets cron_expression or timezone, return 400 with a clear message. enabled and label remain accepted on every kind. The existing prev.Kind == "schedule" guard on next_run_at recompute stays as belt-and-braces, but with this gate in place the recompute branch is now reachable only for the kind it was meant for. * test(autopilot-webhook): close round-2 coverage gaps - IPRateLimitNotBypassedByXFFSpoof: drives the must-fix #2 invariant by rotating XFF across three calls from the same RemoteAddr and asserting the third gets 429. Pre-round-2 this test would have passed for the wrong reason (limiter trusted XFF, so per-bucket collision was incidental); now it pins the bypass-closed property. - IPRateLimitReturns429BeforeDBLookup: updated to set RemoteAddr explicitly and drop the XFF header it was leaning on. With TrustedProxies empty (test default) the limiter keys on the real connection IP, which is what the test wants to assert anyway. - UpdateAutopilotTrigger_RejectsCronExpressionOnWebhookKind + UpdateAutopilotTrigger_RejectsTimezoneOnWebhookKind: drive the round-2 should-fix from the handler boundary. - UpdateAutopilotTrigger_AcceptsEnabledAndLabelOnWebhookKind: counter test so a regression to a blanket reject is caught. * fix(migrations): bump webhook trigger migration 089 → 091 origin/main added 089_squad_no_action_activity_index (and 090_task_is_leader) since our last rebase, re-colliding with our 089_autopilot_webhook_triggers. Bump to 091 so the filename ordering is unambiguous again. The SQL is unchanged — same partial unique index on autopilot_trigger.webhook_token — only the filename moves. * fix(views): dedupe skipped icon in autopilot RUN_VISUAL after rebase The rebase against origin/main merged main's add of `Ban` for the skipped status next to our round-1 `MinusCircle` entry, leaving the RUN_VISUAL map with two `skipped` keys (only the last would have been read at runtime, and MinusCircle had been dropped from the imports during conflict resolution — so the file would not compile). Keep main's `Ban` icon (latest design) and a single `skipped` entry. Carry over the round-1 comment about why the muted styling matters for failure-ratio readability. --------- Co-authored-by: Kerim Incedayi <kerim.incedayi@digitalchargingsolutions.com>	2026-05-18 12:17:39 +08:00
Bohan Jiang	3645bdb5b6	feat(issues): add start_date field with progressive disclosure (MUL-2274) (#2696 ) * feat(issues): add start_date field with progressive disclosure (MUL-2274) Mirrors the existing due_date implementation end-to-end so an issue can express a planned start in addition to a deadline. Surfaces start_date as an optional sidebar property alongside priority / due_date / labels (added in MUL-2275), with consistent picker, board/list/sort, activity, and inbox plumbing. Backs the Project Gantt work (parent MUL-1881) and keeps the progressive-disclosure attribute experience consistent. - DB: migration 091 adds issue.start_date TIMESTAMPTZ. - sqlc: ListIssues / CreateIssue / UpdateIssue / CreateIssueWithOrigin / ListOpenIssues read & write start_date. - Backend: IssueResponse + create/update/batch-update handlers parse and emit start_date with RFC3339 validation; new start_date_changed activity event + subscriber notification (with prev_start_date in event payload). - CLI: --start-date flag on `multica issue create` / `issue update`. - Frontend: StartDatePicker component, start_date wired into Issue type, Zod schema, draft / view stores, sort util, header sort + card-property options, list-row / board-card display, create-issue modal, and the issue-detail progressive-disclosure "+ Add property" surface (visibility rule, picker row, add-property menu icon + label). - i18n: en + zh-Hans for sort_start_date / card_start_date / prop_start_date / activity start_date_set / start_date_removed / picker start_date.trigger_label / clear_action / inbox labels. - Tests: new TestNotification_StartDateChanged; existing Issue / draft / modal fixtures extended with start_date. Co-authored-by: multica-agent <github@multica.ai> * feat(issues): align start_date with due_date in actions menu and CLI table - Add Start Date submenu (today / tomorrow / next week / clear) in actions menu, mirroring Due Date — parity with the Due Date quick setters in list/board context and 3-dot menus. - Add corresponding en / zh-Hans i18n keys (actions.start_date / start_today / start_tomorrow / start_next_week / start_clear). - CLI human table for `multica issue list` and `multica issue get` now shows a START DATE column next to DUE DATE; --full-id variant too. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-17 15:01:38 +08:00
Jiayuan Zhang	4c7a990a25	fix(autopilot): attribute autopilot-created issue to assignee agent (MUL-2293) (#2719 ) Before: dispatchCreateIssue copied autopilot.created_by_type/id onto the new issue's creator_type/creator_id, and the same fields were used as the ActorType/ActorID of the issue:created event. Result: any issue spawned by an autopilot was reported as created by the human who first configured the autopilot, not by the agent that actually owns the work. Downstream subscriber/activity/notification listeners inherited the same wrong actor. After: creator and actor are both the autopilot's assignee agent (creator_type=agent, creator_id=ap.assignee_id). The human owner is still recoverable via origin_type=autopilot + origin_id. Audited the other ap.created_by_* usages: analytics attribution (autopilotActorID, task.go user-id), and the private-agent visibility gate in shouldSkipDispatch — all correctly read the autopilot's owner, not the executor, so they stay as-is. Co-authored-by: multica-agent <github@multica.ai>	2026-05-16 09:32:15 +02:00
iYuan	d8635ad580	fix(issues): prevent duplicate active issue creation (MUL-2225) (#2602 ) * fix: prevent duplicate active issue creation * fix(issues): address duplicate guard review * fix(autopilot): skip duplicate issue admissions * fix(issueguard): tighten duplicate lookup edge cases * test(issues): cover duplicate guard autopilot skips * feat(autopilots): group skipped runs in history	2026-05-15 18:27:56 +08:00
LinYushen	b7a58c06ac	Revert "feat(task): wire claim lease into TaskService and sweeper (MUL-2246) …" (#2673 ) This reverts commit `bb32be0e50`.	2026-05-15 16:06:58 +08:00
LinYushen	bb32be0e50	feat(task): wire claim lease into TaskService and sweeper (MUL-2246) (#2662 ) * feat(task): wire claim lease queries into TaskService and sweeper (MUL-2246) - ClaimTask now uses ClaimAgentTaskWithLease (generates claim_token + lease) - StartTask accepts optional claim_token for token-verified start - AgentTaskResponse includes claim_token for daemon to use - Daemon client sends claim_token in StartTask body - Sweeper calls RequeueExpiredClaimLeases each tick - Legacy daemons without claim_token still work (graceful fallback) Co-authored-by: multica-agent <github@multica.ai> * fix(task): address PR #2662 review blockers (MUL-2246) 1. ClaimAgentTaskForRuntime: push runtime_id into atomic SQL WHERE clause so runtime A cannot claim tasks queued for runtime B under the same agent. 2. Legacy StartAgentTask: add claim_token IS NULL guard so leased rows cannot be started without token verification. Handler rejects malformed tokens with 400 instead of silently degrading to legacy path. 3. StartAgentTaskWithClaimToken: validate claim_expires_at >= now(), preserve claim_token until terminal state (only clear claim_expires_at), use CTE + UNION ALL for idempotent retry when daemon resends after a lost StartTask response. Return 409 Conflict on token mismatch/expiry. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): StartTask 409 handling, transport retry, claim_token on FailTask (MUL-2246) - StartTask 409 (claim superseded): release slot, don't call FailTask - StartTask transport timeout/5xx: retry once with same token, then check task status before failing - FailTask now sends claim_token; server-side FailAgentTask SQL adds AND (claim_token IS NULL OR claim_token = @claim_token) guard so stale daemons cannot fail tasks that have been re-claimed Co-authored-by: multica-agent <github@multica.ai> * fix(task): close FailTask token bypass and RequeueExpiredClaimLeases liveness gap (MUL-2246) Blocker 1 - FailTask token validation: - SQL: change (param IS NULL OR claim_token = param) to (param IS NULL AND claim_token IS NULL) OR claim_token = param so tokenless requests can only fail legacy (tokenless) rows. - task.go: malformed claim_token now returns ErrInvalidClaimToken (400) instead of being silently dropped to NULL. - Handler: maps ErrInvalidClaimToken→400, ErrClaimTokenInvalid→409. - Service: when UPDATE returns no rows but task is still active, return ErrClaimTokenInvalid (token mismatch) instead of silent success. Blocker 2 - RequeueExpiredClaimLeases runtime liveness: - SQL: JOIN agent_runtime, only requeue tasks where runtime is 'online'. Dead/offline runtime tasks stay dispatched for FailTasksForOfflineRuntimes. - FOR UPDATE → FOR UPDATE OF atq (required with JOIN). Regression tests: - task_claim_token_test.go: malformed, tokenless-on-tokened, wrong-token - requeue_lease_test.go: SQL must JOIN agent_runtime with online filter Co-authored-by: multica-agent <github@multica.ai> * fix(task): move expired lease requeue to ClaimTaskForRuntime preflight, add heartbeat freshness backstop (MUL-2246) - Add RequeueExpiredClaimLeasesForRuntime: per-runtime preflight self-requeue in ClaimTaskForRuntime. Runtime proves liveness by actively claiming, so no heartbeat check needed. - Update global RequeueExpiredClaimLeases to require ar.last_seen_at freshness (stale_threshold_secs param). Prevents requeuing to a dead runtime in the 90s gap between lease expiry (60s) and offline detection (150s). - Add regression tests verifying the heartbeat freshness check and that the preflight query does not join agent_runtime. Co-authored-by: multica-agent <github@multica.ai> * fix(task): use LivenessStore for global requeue, move preflight before empty-cache (MUL-2246) Blocker 1: Global RequeueExpiredClaimLeases now uses LivenessStore.IsAliveBatch to verify runtimes are truly alive before requeuing expired leases. When LivenessStore is unavailable (no Redis), global requeue is skipped entirely — the preflight self-requeue in ClaimTaskForRuntime handles live runtimes. This closes the 60-150s gap where a dead runtime still appears online in DB. Blocker 2: Moved RequeueExpiredClaimLeasesForRuntime BEFORE EmptyClaim.IsEmpty fast-path in ClaimTaskForRuntime. Expired leases are now requeued (which bumps the empty cache via notifyTaskAvailable) before the empty check can short-circuit the claim path. Also adds ListRuntimesWithExpiredClaimLeases SQL query and LivenessChecker interface on TaskService. Co-authored-by: multica-agent <github@multica.ai> * fix(task): wire EmptyClaimCache into backend taskSvc for backstop requeue (MUL-2246) The backend taskSvc used by the sweeper only had Liveness wired but not EmptyClaim. When global backstop requeue called notifyTaskAvailable, s.EmptyClaim.Bump() was a nil no-op — the handler's empty-cache was never invalidated, so the daemon's next claim hit a stale empty verdict. Fix: wire the same Redis-backed EmptyClaimCache into the backend taskSvc in main.go (same Redis keys as router.go:139 handler instance). Add regression test verifying backstop requeue invalidates the handler's empty-cache. Co-authored-by: multica-agent <github@multica.ai> * fix(task): global backstop must not requeue — alive runtimes use preflight, dead stay dispatched (MUL-2246) - RequeueExpiredClaimLeases is now a no-op (returns 0 always) - Alive runtimes self-requeue via ClaimTaskForRuntime preflight - Dead runtimes stay dispatched for FailTasksForOfflineRuntimes - Rewriting to queued on dead runtime creates 2h blackhole (offline sweeper only handles dispatched/running) - Test actually calls RequeueExpiredClaimLeases and asserts 0 in all cases Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): remove duplicate usage reporting block after merge conflict (MUL-2246) The merge resolution introduced a second ReportTaskUsage call after the status check, duplicating the usage-before-early-return block that already runs right after runner.run. Remove the duplicate and add a regression test asserting /usage is called exactly once on the normal completion path. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-15 15:15:31 +08:00
Bohan Jiang	a23856bae3	MUL-1624 docs(email): clarify 888888 is opt-in; document SMTP option (#2666 ) * docs(email): clarify 888888 is opt-in via MULTICA_DEV_VERIFICATION_CODE; document SMTP option in self-host docs The startup log line, .env.example, and SELF_HOSTING_ADVANCED.md still implied that the dev master code 888888 is auto-active whenever APP_ENV != "production". That has not been true since the master code was gated behind MULTICA_DEV_VERIFICATION_CODE — the fixed code is disabled by default and must be opted in explicitly. Also extend the docs site with the SMTP relay backend added in #1877: auth-setup, environment-variables, and self-host-quickstart now cover both Resend and SMTP options in EN and ZH. Co-authored-by: multica-agent <github@multica.ai> * docs(email): treat SMTP as an email backend in self-host docs and startup warning Address review feedback on #2666: - server: startup warning now fires only when both RESEND_API_KEY and SMTP_HOST are empty, since either one is a valid email backend. Otherwise the log mis-tells SMTP-only operators that verification codes go to stdout. - self-host-quickstart (EN/ZH): tell readers to fetch the verification code from whichever backend they configured (Resend or SMTP); fall back to stdout only when neither is configured. - auth-setup (EN/ZH): \"without Resend\" → \"without any email backend configured\" so the wording stays correct now that SMTP is a first-class option. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-15 14:18:46 +08:00
apollion69	35e9a7f0f6	feat(email): add SMTP relay as alternative to Resend for self-hosted deployments (#1877 ) * feat(email): add SMTP relay as alternative to Resend Self-hosted deployments often run behind a corporate firewall with an existing SMTP relay (Exchange, Postfix, sendmail) and no access to external SaaS APIs. Resend requires a public domain, an API key, and outbound HTTPS to api.resend.com — all unavailable in air-gapped or private-network setups. This adds a second email delivery path using Go's stdlib net/smtp, activated when SMTP_HOST is set. Priority order: 1. SMTP relay (SMTP_HOST set) 2. Resend API (RESEND_API_KEY set) 3. DEV stdout (neither set) New env vars (all optional, no breaking change): SMTP_HOST — SMTP server hostname SMTP_PORT — port, default 25 SMTP_USERNAME — for authenticated SMTP; empty = unauthenticated relay SMTP_PASSWORD — used only when SMTP_USERNAME is set SMTP_TLS_INSECURE — set to "true" to skip TLS cert verification (for private CA / self-signed certs) The implementation: - Dials TCP, creates smtp.Client manually (avoids smtp.SendMail which does not expose TLS config) - Tries STARTTLS if advertised; uses InsecureSkipVerify only when SMTP_TLS_INSECURE=true (opt-in, nolint:gosec annotated) - Applies PlainAuth only when SMTP_USERNAME is non-empty - Wraps all errors with context for easier debugging - Reuses existing HTML templates from buildInvitationParams for invitation emails (no template duplication) Also updates .env.example and docker-compose.selfhost.yml with the new variables and inline documentation. * fix(email): add dial timeout, session deadline, RFC headers for SMTP path Address review blockers from multica-eve and Bohan-J (PR #1877): - net.Dial → net.DialTimeout(10s) + conn.SetDeadline(30s) so a blackholed SMTP relay cannot hang SendVerificationCode (called synchronously from the auth handler) or leak goroutines in the invitation path. - Add Date, Message-ID, and proper Content-Transfer-Encoding headers. Date is required by RFC 5322; many strict relays reject messages without it. Message-ID aids deliverability and threading. - MIME-encode Subject via mime.QEncoding so non-ASCII workspace/inviter names (CJK, emoji) survive without corruption across any RFC 2047-conformant relay. - Probe 8BITMIME after (possible) STARTTLS: use Content-Transfer-Encoding 8bit when the relay advertises 8BITMIME, quoted-printable otherwise — safe for all relay configurations without forcing base64 overhead. - Update SELF_HOSTING_ADVANCED.md to document Option B (SMTP relay) alongside the existing Resend section, including all five env vars and a note that port 465/SMTPS is not yet supported. * fix(email): correct has8Bit assignment order (bool is first return of Extension)	2026-05-15 13:35:01 +08:00
Jiayuan Zhang	4d6b5ad06f	fix(squad): wake leader when dual-role agent posts as worker (MUL-2218) (#2626 ) * fix(squad): wake leader when dual-role agent posts as worker (MUL-2218) The squad-leader self-trigger guard skipped a comment whenever the author equalled the squad's leader id, regardless of the role the agent was acting in. For an agent that holds both leader and worker roles in the same squad, this meant the leader role never reacted to its own worker output and the issue stalled. Tag each enqueued task with is_leader_task and consult the agent's most recent task on the issue from both self-trigger guards (comment path + @squad mention path) — skip only when that task was itself a leader task. Co-authored-by: multica-agent <github@multica.ai> * fix(squad): inherit is_leader_task on retry task clone (MUL-2218) CreateRetryTask cloned a parent task into a fresh queued attempt but omitted is_leader_task from the column list, so the child silently fell back to the column default (false). For a leader task that hit auto-retry through MaybeRetryFailedTask, the retried task posed as a worker task — the self-trigger guard then no longer recognised the leader's own comments, re-opening the very loop MUL-2218 closes. Inherit p.is_leader_task in the clone and add a query-level test that covers both leader and worker retries. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-14 15:23:36 +02:00
LinYushen	0cb759b446	fix(squad): suppress no-action leader comments (#2583 )	2026-05-14 14:07:26 +08:00
fr00st	cc9fbd3db0	Fix stale Done replies on comment follow-ups (#2495 ) * fix: avoid stale done replies on comment follow-ups * fix: avoid inlining runtime brief for Hermes ACP * fix: address comment follow-up review feedback	2026-05-14 12:00:04 +08:00
Bohan Jiang	0345285b86	feat(quick-create): searchable actor picker + squad support (#2552 ) * feat(quick-create): searchable actor picker + squad support (MUL-2163) - Replaces the flat agent dropdown in the "Create with agent" modal with a searchable PropertyPicker that lists Agents and Squads in separate sections, so users can filter by name and pick a squad as the creator. - Persists the selection as (lastActorType, lastActorId), removing the agent-only lastAgentId field on the quick-create store. - Adds squad_id to the quick-create API request and stamps it onto the task's QuickCreateContext. The handler resolves the squad to its leader agent (re-using validateAssigneePair) and the daemon claim path injects the squad-leader briefing when the task carries a squad hint, matching the behavior of issue-bound squad tasks. Co-authored-by: multica-agent <github@multica.ai> * fix(create-issue): forward squad picks across manual→agent switch Manual mode → agent mode previously only carried `agent_id`, so picking a squad and then flipping to agent silently fell back to the persisted actor / first visible agent and lost the user's choice. Carry `squad_id` on the same branch so the agent panel honors the squad pick. Adds a sibling test alongside the existing project-carry case. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-13 22:31:17 +08:00

1 2 3

136 Commits