Address review: the Cmd/Ctrl+W path reused closeActiveTab(), an
unconditional force-close reserved for route-crash recovery, so the
shortcut could close a pinned tab or reseed the sole tab — both of which
the TabBar deliberately hides the close button for. Add a guarded
closeActiveTabIfClosable() that no-ops for pinned / only tabs and wire
the shortcut to it. (MUL-2987)
Co-authored-by: multica-agent <github@multica.ai>
Cmd/Ctrl+W was bound to the OS application-menu "Close Window" role,
tearing down the whole window and every tab. handleAppShortcut now
swallows Cmd/Ctrl+W and dispatches a close-active-tab action over a new
IPC channel; the renderer's DesktopShell calls the existing
tab-store closeActiveTab(). (MUL-2987)
Co-authored-by: multica-agent <github@multica.ai>
* fix(sidebar): hide stale pinned items immediately on workspace switch
When the user switches workspaces, previously pinned items from the old
workspace were briefly visible until the new workspace data loaded. Reset
the pinned list to an empty array on workspace-id change so the stale
items disappear instantly, eliminating the flicker.
* fix: separate pinnedItems and wsId effects to prevent drag-sort loss
When wsId changes, the combined effect would trigger even if pinnedItems
hadn't changed yet (still old workspace data), overwriting localPinned
and losing any pending drag-sort results.
Split into two independent effects:
- pinnedItems effect: updates localPinned when data changes (respects isDragging)
- wsId effect: updates localPinnedWsId immediately on workspace switch
This ensures workspace switches don't interfere with drag operations.
The Lark short-window debounce (MUL-2968, #3742) can land several user
messages in a chat session before a single agent run fires. But the
daemon claim built the agent prompt from only the *single most recent*
user message (walk history backward, take first user message, break).
So 「看上海天气」then「还有青岛」debounced into one run, and the agent
received only 「还有青岛」— it answered Qingdao and never saw Shanghai.
The session itself was correct (both messages persisted); the gap was in
what the run delivered to the agent. Before debouncing this was masked
because each message got its own run.
Build the prompt from the whole unanswered set instead: the trailing run
of user messages after the last assistant reply (every completed/failed
run writes an assistant row, so the anchor advances one turn at a time —
the full burst on the first turn, only the new message(s) after a reply).
Attachments are collected from each included message. Extracted the
selection into a pure trailingUserMessages helper with table-driven unit
tests, plus a DB-backed claim test asserting both messages reach the
agent and that a post-reply message delivers alone.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* fix(desktop): surface expired login instead of silent "Starting" daemon (MUL-2973)
When the local daemon's cached PAT is expired/revoked, the daemon 401s during
startup and exits before it serves /health. The desktop polled /health forever
and kept reporting "starting", so the runtime sat at "Starting…" with no hint
that re-login was the fix (GitHub #3512).
Detect this in the layer that owns the daemon's credential: when a start fails
to reach "running", probe the token against GET /api/me. A 401 (or missing
token) surfaces a new "auth_expired" daemon state; a 2xx means the token is
fine (non-auth failure) and a network error stays inconclusive — so a network
blip is never misclassified as expired login.
The desktop then shows a "Sign-in expired · Sign in again" prompt on the
runtimes card and a banner in Daemon settings. The action drops the stale
cached PAT, re-mints a fresh one from the current session, and restarts the
daemon; if minting also 401s (the session token is dead) it falls back to the
standard re-login flow. No daemon/CLI behavior change.
Co-authored-by: multica-agent <github@multica.ai>
* fix(desktop): only force re-login on a real 401 during daemon reconnect (MUL-2973)
Review feedback: the reconnect helper treated any failure from clearToken /
syncToken / restart as "session is dead" and logged the user out. A transient
failure (mint 5xx, network blip, config write error, restart hiccup) would
wrongly sign them out.
Move the failure classification into the main process, where the real HTTP
status is available: mintPat now tags its error with the response status, and a
new daemon:reauthenticate handler returns a structured ReauthResult — `ok`,
`session_invalid` (a genuine 401 → the session token itself is dead), or
`transient`. The renderer only calls logout() on `session_invalid`; transient
failures keep the user signed in and show a retryable toast. An unexpected IPC
error is also treated as transient, never as logout.
Add tests locking the classifier (401 → auth, 5xx/network/IO → not auth) and the
renderer behavior (transient failure and IPC throw do NOT log out).
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
A forwarded transcript plus a follow-up note arrive as two separate Lark
messages, each of which synchronously called EnqueueChatTask — so the bot
ran twice (once on the bare forward, before the note arrived). The chat
task already reads the whole session history at run time, so the messages
never needed stitching; only the run TRIGGER did.
Introduce pendingBatcher: a per-chat_session debouncer that collapses a
burst into one agent run on a 3s silence window. Each message is still
appended, deduped, and ACKed synchronously and individually; step 8 of the
dispatcher now schedules a debounced flush instead of enqueuing inline.
Because EnqueueChatTask's agent-offline / agent-archived verdict is now
only known at flush, the dispatcher emits that notice itself via an
injected FlushReply (wired to OutcomeReplier.Reply) rather than returning
it synchronously to the hub. Infra failures are logged, not surfaced — the
inbound frame was ACKed long ago. The hub drains the batcher on graceful
shutdown so a normal restart does not drop a pending window.
Out of scope (owner-aligned): group-chat multi-speaker batching, restart
recovery for the in-process window, and forwarded-sender real-name
resolution.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
Expand an inbound Lark bot message's body before dispatch with the context
a user explicitly attached, so the agent sees a semantically complete
conversation instead of a bare "@bot 总结一下".
- post: flatten rich-text (title + paragraphs, links, @-mentions) to plain
text synchronously in the decoder.
- merge_forward: inline the forwarded transcript via a single GetMessage —
GET /open-apis/im/v1/messages/{id} returns the forward sentinel plus the
bundled children. (The issue's container_id_type=merge_forward query is
undocumented; this avoids it and also handles a forwarded quoted parent.)
- quoted reply: prepend the parent_id message as a <quoted_message> block;
a parent that is itself a forward nests a <forwarded_messages> block.
- new InboundEnricher runs in the WS connector between decode and emit,
bounded by EnrichTimeout and degrading to "[unable to fetch]" placeholders
so it never blocks the ~3s long-conn ACK budget.
/issue stays parseable on a quote-reply by parsing the command from the
user's own text (CommandBody) rather than the enriched body.
Short-window debounce batching (issue item #4) is tracked as a follow-up.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* chore(analytics): stop shipping operational events to PostHog (MUL-2967)
Operational / execution-lifecycle telemetry dominated PostHog event volume
and drove the bill: runtime_offline alone was ~54% of ~22.6M events/mo, and
~99% of events were billed at the higher identified-event rate. These signals
already have Prometheus counters (Grafana), so the PostHog copies were
redundant cost.
- Add analytics.IsMetricsOnly; metrics.RecordEvent now skips the PostHog
Capture for runtime_* and autopilot_run_* while still incrementing their
Prometheus counter (their analytics.Event constructors are retained to feed
the metric label set via IncForEvent).
- Remove the agent_task_* PostHog path entirely: drop captureTaskEvent and the
AgentTask* constructors/constants. Their Prometheus side is unchanged via the
typed BusinessMetrics.RecordTask* methods. Also remove the now-dead
taskDurationMS / willRetryTask helpers.
- Update the pairing lint test (no agent_task allow-list, no naked-Capture
exception), add a RecordEvent skip test + IsMetricsOnly test, and update
docs/analytics.md (taxonomy, per-event banners, reconciliation).
Product/funnel events (signup, onboarding, issue_created, issue_executed,
chat_message_sent, agent_created, autopilot_created, etc.) are unchanged and
still ship to PostHog.
Co-authored-by: multica-agent <github@multica.ai>
* docs(analytics): correct agent_task Prometheus metric contract (MUL-2967)
Address PR review: the agent_task_* "Prometheus-only" banner claimed the old
PostHog event properties (task_id, agent_id, duration_ms, error_type,
will_retry, ...) were the metric label set. They are not — the real labels are
only source/runtime_mode/provider/terminal_status/failure_reason.
- Replace the agent_task_* sections with the actual metric names and labels
(multica_agent_task_*; see business.go / labels.go), and explain that
completed/failed/cancelled are terminal_status values on
multica_agent_task_terminal_total, with wall-clock in the *_seconds
histograms.
- Tighten the runtime_*/autopilot_run_* banners so id properties aren't
mistaken for labels.
- Drop the stale AgentTask allow-list reference from the pairing lint test
header comment.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* fix(lark): use named import for react-qr-code to survive electron-vite interop
Clicking Bind on the agent detail page white-screened the desktop app at
the QR step:
Element type is invalid: expected a string or a class/function but got:
object. Check the render method of `LarkInstallDialog`.
react-qr-code is a CJS package. `import QRCode from "react-qr-code"`
relies on the bundler's __esModule default-interop to unwrap `.default`.
Next.js (web) unwraps it correctly; electron-vite's dep-optimizer handed
back the whole module namespace object `{ default, QRCode, __esModule }`
instead of the component, so React got an object where it expected a
component the moment <QRCode> mounted — desktop-only white screen, web
unaffected.
Switch to the named import `{ QRCode }`, which maps straight to
`exports.QRCode` and doesn't depend on the flaky default-interop path.
Resolves correctly under both bundlers; the package's own .d.ts exports
both the named class and the default, so it typechecks unchanged.
Not a backend / Lark-config issue — purely a frontend CJS interop bug.
* test(lark): expose named QRCode export in react-qr-code mock
Follow-up to the named-import switch in lark-tab. The test stubbed
react-qr-code with only a `default` export; now that the component
imports `{ QRCode }`, the named binding resolved to undefined and the
3 QR-rendering tests failed with "No QRCode export is defined on the
react-qr-code mock". Return the stub as both `QRCode` and `default`,
defined inside the factory (vi.mock is hoisted above top-level vars).
MULTICA_LARK_HTTP_ENABLED and MULTICA_LARK_WS_ENABLED were staging
knobs from the multi-PR rollout of the Lark MVP — they let the DB
schema + inbound dispatcher land before the HTTP wire was real, and
before the WS long-conn protocol was wired. Now that the MVP has
shipped end-to-end, "I set SECRET_KEY but I don't want to talk to
Lark" is not a useful production state: setting the at-rest master
key is the operator's opt-in for the integration as a whole.
Collapse the gate down to MULTICA_LARK_SECRET_KEY alone. When the
key is present, wire the real HTTPAPIClient + the real
WSLongConnConnector. CI / integration tests that want stub-style
behaviour can point MULTICA_LARK_HTTP_BASE_URL at a mock server
(already supported) instead of toggling a separate flag. Host
overrides (HTTP_BASE_URL, REGISTRATION_DOMAIN, CALLBACK_BASE_URL)
stay — those are real ops needs for international tenants / staging.
stubAPIClient + NoopConnectorFactory remain exported because the
test suite uses them directly; only the router boot path stops
reaching for them. The connector factory keeps its noop fallback
for the case where the endpoint fetcher fails to construct, so a
malformed MULTICA_LARK_CALLBACK_BASE_URL degrades gracefully
(visible as "connector=noop" in the boot log) instead of panicking
the server.
Lark integration + handler tests still pass; go vet clean.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
Re-dispatching the same agent on the same issue reuses the persistent workdir
via execenv.Reuse(), where the standard-provider skill refresh re-wrote skills
without clearing the prior dispatch's output, so allocateCollisionFreeSkillDir
dodged Multica's own directories into issue-review-multica-N.
On reuse, reclaim the platform-owned managed skill directories the prior
manifest recorded (removeReusedManagedSkillDirs) and roll back the remaining
sidecar files (CleanupSidecars) before refreshing, so each skill lands at its
canonical slug every dispatch. Mirrors the Codex hydrateCodexSkills wipe;
scoped to reuse, which never runs for local_directory tasks.
Fixes#3684 (MUL-2963).
* feat(db): add Lark integration migration (MUL-2671)
Introduces seven tables for the 飞书 Bot integration MVP — per-agent
PersonalAgent installations, user/chat bindings, inbound dedup +
non-content drop audit, outbound card mapping, and short-lived
single-use member binding tokens.
Schema notes:
- chat_session schema unchanged; Lark routes through a separate
binding table rather than adding a metadata JSONB column.
- Outbound card mapping is task/message scoped so multiple runs on
the same session can't stomp each other's cards.
- lark_inbound_audit stores routing / identity / drop_reason ONLY,
never message body — the audit channel for unbound users and group
messages that don't address the Bot.
- app_secret stores ciphertext (encryption helper lands in a follow-up
commit on this branch); DB never sees plaintext.
Co-authored-by: multica-agent <github@multica.ai>
* feat(util): add secretbox AES-256-GCM helper for at-rest secrets
First consumer is lark_installation.app_secret (MUL-2671 §4.4), but
the helper is intentionally generic — future per-tenant secrets that
must not appear in a DB dump can reuse it.
Construction: AES-256-GCM with a per-message random nonce, providing
authenticated encryption. Tampered ciphertext fails Open instead of
silently decrypting to garbage. Master key loaded from a base64 env
var via LoadKey; key rotation is not in scope yet.
Co-authored-by: multica-agent <github@multica.ai>
* refactor(issues): extract IssueService.Create as single create entry (MUL-2671)
Establishes the service-layer boundary mandated by Elon's 二审 of
MUL-2671 §4.8: issue creation no longer lives inside the HTTP
handler. Both the HTTP POST /issues handler and the future Lark
/issue command call into service.IssueService.Create, so duplicate
guard, issue numbering, attachment linking, broadcast, analytics,
and agent/squad enqueue stay aligned.
Handler responsibilities shrink to parsing the HTTP request, doing
actor resolution / validation (transport-specific), and converting
service results into the IssueResponse + 201. The transaction-wrapped
core, attachment link, event publish, analytics capture, and
agent/squad enqueue all move into service.IssueService.Create.
A BroadcastPayload callback on the service keeps the WS broadcast
shape (the full IssueResponse) without forcing the service to
depend on handler-layer response types.
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations): add Lark package skeleton (MUL-2671)
Establishes the architectural boundaries Elon's 二审 mandated as
first-PR blockers without dragging in OAuth, WebSocket, or
card-patching code (those land in follow-up PRs):
- ChatSessionService interface — channel-aware chat-session entry
point for Lark, deliberately separate from the HTTP SendChatMessage
handler. The HTTP handler's single-creator guard (creator_id ==
request user_id) is correct for the browser client but rejects
group chat_sessions by construction; Lark needs its own service.
- AuditLogger interface — the only path for recording dropped events.
Its signature deliberately omits message body, enforcing the
drop-audit policy (MUL-2671 §4.7) at the type level: unbound users
and non-addressed group messages can't accidentally end up in
chat_session.
- Typed IDs (OpenID, ChatID) prevent UUIDs from being conflated with
Lark-side identifiers at compile time.
- DropReason constants align dashboard/audit queries across callers.
Co-authored-by: multica-agent <github@multica.ai>
* refactor(issues): move parent/project workspace check into IssueService (MUL-2671)
Parent existence and project workspace membership now live inside
IssueService.Create, inside the same transaction as the duplicate guard
and counter increment. The HTTP handler stops re-implementing the
lookup; every future create entry (Lark /issue, MCP, API keys) inherits
the same boundary without copy-pasting the SQL.
Adds two error sentinels (ErrParentIssueNotFound, ErrProjectNotFound)
so transports can translate to their own error shapes. Handler-level
cross-workspace tests guard the boundary against future regressions.
Co-authored-by: multica-agent <github@multica.ai>
* fix(db): harden Lark migration safety底座 — TTL cap + workspace FK (MUL-2671)
Two storage-layer hardenings that move the must-fix line off "the app
layer enforces it" and onto the schema itself, so future write paths or
hand-inserted rows cannot regress the invariants.
1) lark_binding_token TTL cap. The DB CHECK was 1 hour as
defense-in-depth while the app constant was 15 minutes; the CHECK
now matches the product cap (15 minutes). Application constant
docstring updated to reflect that storage enforces the same bound.
2) lark_user_binding workspace membership. The table previously only
FK'd to workspace / user / installation independently, so a binding
could exist for a user no longer in the workspace, or claim a
workspace different from its installation's. Two composite FKs
close the gap structurally:
* (installation_id, workspace_id) → lark_installation(id, workspace_id)
— guarantees a binding's workspace_id always matches its
installation's workspace_id. A new UNIQUE (id, workspace_id) on
lark_installation is added as the FK target.
* (workspace_id, multica_user_id) → member(workspace_id, user_id)
with ON DELETE CASCADE — when a user is removed from the
workspace, the binding cascades away in the same transaction.
There is no longer a path where lark_user_binding outlives
workspace membership.
These two FKs are the schema-level proof for §4.3's "unbound or
non-workspace members cannot leak content into chat_session" invariant.
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations/lark): inbound services + /issue dispatcher (MUL-2671)
Lands the inbound service layer for the Lark Bot MVP, sitting on top
of the migration + service-boundary scaffold from the previous commits.
What ships:
- sqlc queries for all seven lark_* tables (idempotent dedup insert,
CAS WS-lease, single-use binding-token consume, etc.) plus
GetMostRecentUserChatMessage for the /issue fallback.
- AuditLogger backed by lark_inbound_audit; signature deliberately
body-free so callers cannot leak content into the drop log.
- ChatSessionService: find-or-create chat_session via the binding
table (winner-takes-all on the UNIQUE race), append-with-dedup, /issue
parser, "previous user message" fallback for bare `/issue` invocation.
- Dispatcher orchestrates the inbound pipeline in one place:
installation routing → group-mention filter → identity check → ensure
session → append+dedup → /issue → enqueue chat task. Group sessions
use the installer as creator (stable workspace identity); p2p uses
the sender. Agent-offline path falls through with OutcomeAgentOffline
so the WS adapter can reply with the offline notice from §4.6.
- BindingTokenService: random URL-safe token, SHA-256 stored hash,
15-min TTL pinned at the application AND the DB CHECK; Redeem
returns the same opaque error for all rejection cases (no timing
oracle on replay).
- Unit tests for the parser (13 cases), dispatcher (8 cases via fake
Queries/Chat/Audit/IssueCreator/Enqueuer), and binding-token
hash/entropy. Real-DB integration tests for OAuth + token redeem
land alongside the HTTP handlers in the next commit.
Out of scope for this commit (next ones on the same feature branch):
OAuth callback, HTTP routes, WebSocket hub, outbound card patcher,
frontend.
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations/lark): installation HTTP surface + secretbox-gated wiring (MUL-2671)
Lands the HTTP boundary on top of the inbound services from the
previous commit. What ships:
- InstallationService.Upsert: the only path that writes
lark_installation. Encrypts app_secret with the secretbox passed in
at construction time; refuses to fall back to plaintext storage
(returns an error from the constructor if no Box is supplied), so a
misconfigured dev environment cannot accidentally land a row with
cleartext credentials. Revoke flips status without DELETE so audit
trail survives.
- HTTP handlers under /api/workspaces/{id}/lark/:
* GET /installations — member-visible (Integrations tab
renders for non-admins). Soft 200
with empty list + configured:false
when MULTICA_LARK_SECRET_KEY is
unset, so the tab does not error
on self-host that has not opted in.
* POST /installations — admin-only; 503 when not configured.
Re-validates agent_id ∈ workspace
before accepting credentials so a
cross-workspace agent UUID is
rejected.
* DELETE /installations/{id} — admin-only; workspace-scoped lookup
so one workspace cannot revoke
another's installation by UUID
guess.
- POST /api/lark/binding/redeem (user-scoped, no workspace context):
the only path that mints a lark_user_binding row from user action.
Redeemer identity comes from the session, not the token, so a stolen
link cannot bind an open_id to an attacker's Multica user. The
composite FK on lark_user_binding cascades the binding away if the
user is not (or no longer) a workspace member, so a non-member who
steals the link gets 403 at the DB layer.
- Two new event-bus types in protocol.events:
EventLarkInstallationCreated, EventLarkInstallationRevoked.
- Router wiring: MULTICA_LARK_SECRET_KEY drives a conditional
initialization of h.LarkInstallations + h.LarkBindingTokens. When
unset, the integration disables itself with an INFO log and the
rest of the server boots normally.
- Handler tests cover all four not-configured short-circuits.
Happy-path integration tests (real DB, full create→list→revoke
cycle and token mint→redeem) ship alongside the WS hub PR.
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): close binding-token rebind & typed task errors (MUL-2671)
Two must-fixes from PR review on HEAD 87ad15e1:
1. Binding-token redeem could be used to grab an already-bound Lark
open_id. Two changes harden the path:
- lark.sql `CreateLarkUserBinding` now gates ON CONFLICT DO UPDATE
on `multica_user_id = EXCLUDED.multica_user_id`, so a cross-user
rebind via a second valid token returns zero rows instead of
silently switching ownership.
- `BindingTokenService.RedeemAndBind` consumes the token and writes
the binding row inside one transaction. A failed bind no longer
burns the token; a successful bind never leaves a consumed-but-
unused token. Distinct typed errors: ErrBindingTokenInvalid (410),
ErrBindingAlreadyAssigned (409), ErrBindingNotWorkspaceMember
(403). The handler maps each to its own status code.
2. Dispatcher collapsed every `EnqueueChatTask` error to
`OutcomeAgentOffline`, hiding infra failure and misusing the
"offline" label for cases (e.g. archived agent) where it doesn't
fit. Now:
- `service.EnqueueChatTask` returns `ErrChatTaskAgentNoRuntime` and
`ErrChatTaskAgentArchived` as sentinel errors; DB / load / insert
failures stay wrapped as ordinary errors.
- Dispatcher uses `errors.Is` to map only the productizable cases
(`OutcomeAgentOffline`, new `OutcomeAgentArchived`); any other
error is returned to the WS adapter so it can retry or page
instead of disguising the outage as an offline card.
A daemon that's merely disconnected is still NOT an error — as long
as `agent.runtime_id` is set the chat task enqueues and waits for the
daemon to claim it on next online (returns `OutcomeIngested`).
Co-authored-by: multica-agent <github@multica.ai>
* ci: re-trigger workflow on lark MVP must-fix HEAD
Co-authored-by: multica-agent <github@multica.ai>
* ci: re-trigger workflow on lark MVP must-fix HEAD (retry)
Co-authored-by: multica-agent <github@multica.ai>
* test(integrations/lark): guard binding-token sentinel contract (MUL-2671)
Two unit tests that document and protect the must-fix invariants
without requiring a DB:
1. TestRedeemAndBindRequiresTxStarter — if a future refactor wires
up BindingTokenService without a TxStarter, RedeemAndBind must
fail fast with a clear error rather than nil-panic on Begin.
The atomicity contract (consume + bind commit together) depends
on that transaction existing.
2. TestBindingErrorSentinelsAreDistinct — the HTTP handler maps
ErrBindingTokenInvalid → 410, ErrBindingAlreadyAssigned → 409,
ErrBindingNotWorkspaceMember → 403. Accidentally aliasing them
(e.g. var ErrBindingAlreadyAssigned = ErrBindingTokenInvalid)
would silently regress the response codes without any other
test catching it.
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations/lark): WS hub orchestrator + outbound card patcher (MUL-2671)
The hub owns one supervisor goroutine per active installation. Each
supervisor acquires the WS lease via the existing CAS query, runs an
EventConnector (interface — real Lark wire protocol lands in a follow-up
behind it), renews the lease on a tighter cadence than the TTL, and
backs off (with jitter) on connector failure. Lease loss tears the
connector down cleanly; revocation is reaped on the next sweep. Per-
process node id satisfies §4.4 multi-replica safety: at most one Hub
globally holds the lease for any installation.
The patcher subscribes to task / chat-done events on the existing
events.Bus and keeps the per-task Lark interactive card in sync
(thinking → streaming → final | error). Card binding is per-task as
required by §4.5; throttled patches via an in-memory last-patched map;
final / error transitions bypass the throttle so the user always sees
the terminal state. The Renderer is plug-replaceable so the product
card template can evolve without touching transport.
The APIClient interface centralizes the Lark Open Platform surface
this package needs (send card, patch card, send binding prompt,
exchange OAuth code). The default stubAPIClient returns
ErrAPIClientNotConfigured for every transport call so a misconfigured
deployment fails loudly instead of dropping cards silently. Real
implementation lands in a follow-up; OAuth callback + frontend entries
land in the next commits on this branch.
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations/lark): OAuth install start / callback (MUL-2671)
OAuthService builds a signed-state Lark authorization URL the frontend
can render as a QR (or open directly), then on callback verifies the
HMAC-protected state, exchanges the OAuth code for installation
credentials via APIClient.ExchangeOAuthCode, and persists the row via
InstallationService.Upsert (which keeps app_secret encryption inside a
single chokepoint).
State token format: workspaceID.agentID.initiatorID.expiresUnix.nonce.sig
— HMAC-SHA256 over the first five fields with a deployment-level
secret. TTL defaults to 10 minutes (covered by tests). Three failure
modes (invalid state / expired state / missing code) map to typed
errors so the HTTP handler can emit a single lark_error= query param
the frontend uses to pick copy.
Both endpoints degrade cleanly: the at-rest key gate (already in place)
returns 503 from /install/start when the InstallationService is nil,
and the OAuth gate (MULTICA_LARK_OAUTH_APP_ID / _SECRET / _REDIRECT_URI
/ _STATE_SECRET) returns configured:false from /install/start so the
frontend can render "configure manually instead" without an error
banner. /install/callback always finishes with a redirect to
/settings?tab=lark carrying either lark_installed=1 or lark_error=<code>.
Tests cover signed-URL shape, missing-config rejection, tampered state,
expired state, propagated exchange error, and the no-config redirect
path on the HTTP handler.
Co-authored-by: multica-agent <github@multica.ai>
* feat(views/lark): settings tab + agent bind button + /lark/bind redemption page (MUL-2671)
Adds the user-facing Lark surface across the shared packages:
- packages/core/types/lark.ts — wire shapes that mirror server/internal/
handler/lark.go. Optional fields default to undefined so older desktop
builds keep parsing if the server adds new keys (CLAUDE.md → API
Response Compatibility).
- packages/core/lark/{queries,index}.ts — Tanstack Query options keyed
by workspace id; realtime sync invalidates `installations(wsId)` on
`lark_installation:*` events.
- packages/core/api/client.ts — listLarkInstallations,
getLarkInstallURL, deleteLarkInstallation, redeemLarkBindingToken.
- packages/views/settings/components/lark-tab.tsx — Settings → Lark
panel. Listing is member-visible (matches backend); disconnect is
admin-only. Empty state points users at the per-Agent bind entry,
matching the (workspace_id, agent_id) UNIQUE: there is no
"pick an agent" UI here because the bind URL is per-agent.
- LarkAgentBindButton (same file) is the per-Agent CTA the Agent
detail page imports. Opens the OAuth URL in a new tab; the callback
bounces back to /settings?tab=lark with a query param the panel
reads for inline confirmation copy.
- packages/views/lark/bind-page.tsx — the Bot's "you need to bind"
destination. Requires session before redeeming, distinguishes the
410/409/403 backend responses into distinct copy.
- apps/web/app/lark/bind/page.tsx — Next.js route wrapping the shared
bind page in a Suspense boundary (Next 15 useSearchParams rule).
i18n: all user-facing strings land in en/zh-Hans, settings tab nav
includes a Sparkles-iconed Lark entry, bind-page copy lives under
common.lark_bind so it works pre-workspace-context too. typecheck +
lint clean.
Co-authored-by: multica-agent <github@multica.ai>
* chore(integrations/lark): wire outbound Patcher into server bootstrap (MUL-2671)
Constructs the Patcher next to the existing Installation/BindingToken
wiring in router.go and Register()s it on the event bus. With the stub
APIClient any actual transport call surfaces ErrAPIClientNotConfigured;
once the real Lark client lands, swap NewStubAPIClient for the real
implementation here without touching the Patcher's subscription logic.
doc.go updated to reflect everything the package now contains (Hub,
Patcher, OAuthService, APIClient interface). The Hub itself is NOT
booted here yet — it needs an EventConnector implementation for the
Lark long-connection wire protocol, which lands in a follow-up; the
orchestrator code and its unit tests are in place so that follow-up
can focus on the WS protocol rather than lifecycle plumbing.
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): address Elon 二审 5 must-fix items (MUL-2671)
- Hub: renewer cancels run ctx on lease loss so the connector exits
even if its wire I/O is blocked, keeping the §4.4 ownership
invariant intact under lease theft.
- Hub: EventEmitter returns (DispatchResult, error) so the real
connector can post the matching Lark-side card (needs_binding,
agent_offline, agent_archived) and react to infra failures instead
of silently logging at the seam.
- Dispatcher: top-level message_id dedup runs before group filter
and identity check, so a reconnect storm cannot re-fire binding
prompts or re-spam not_addressed_in_group audit rows; the in-
AppendUserMessage dedup is removed since the table-level UNIQUE
is the ultimate backstop.
- OAuth: HandleCallback auto-binds the installer via the new
InstallerBinder seam (BindingTokenService implements it), so the
§2.1 "scan to bind, you're done" promise holds end-to-end.
validateExchangeResult now requires installer open_id; new error
reason codes wired through the callback redirect.
- Frontend / handler: install_supported listing field + StartLark-
Install short-circuit on stub APIClient hide install entry points
(Settings tab + per-agent button) while no real Lark HTTP client
is wired, so users do not land in an OAuth flow that fails at
exchange.
Includes tests for each fix (lease-loss cancel, emit error
propagation, dedup ordering, OAuth installer-bind contract, stub-
client install gate) and i18n strings for the new preview state.
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): two-phase dedup so infra failures do not swallow messages (MUL-2671)
The pre-fix top-level dedup wrote the lark_inbound_message_dedup row before
EnsureChatSession / AppendUserMessage. An infra error in either step left
the row in place and a WS-adapter retry was mis-classified as a duplicate,
so the user's Lark message was permanently lost without ever landing in
chat_session.
Make dedup two-phase:
- ClaimLarkInboundDedup acquires an in-flight claim (processed_at NULL).
Stale claims older than 60 s are re-takeable so a process crash does
not strand the message_id.
- MarkLarkInboundDedupProcessed flips processed_at on durable success
(audit row OR chat_message + session touch).
- ReleaseLarkInboundDedup deletes the in-flight row on infra failure
before any durable side effect, so the retry can re-claim immediately.
Dispatcher.Handle now finalizes the claim exactly once based on whether
the inner pipeline reached a durable outcome — chat_message commit being
the transition point (errors past it Mark, errors before it Release).
Regression tests cover the two failure variants Elon flagged plus the
inverse invariants (durable-error Marks, drops Mark, in-flight replays
drop, stale claims re-claim).
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): owner-fence dedup claim to close the double-write windows (MUL-2671)
The two-phase Claim/Mark/Release fix from the previous commit closed the
"infra error swallows a replay" gap but left two windows that could still
write a chat_message twice for the same Lark message_id:
1. Stale-reclaim race. Worker A claims at t=0, runs slowly past the
60 s staleness TTL but is still alive. Worker B sees the row as
stale and re-takes the claim. A reaches AppendUserMessage and
commits a second chat_message.
2. Mark window. Worker A commits chat_message but the post-pipeline
MarkLarkInboundDedupProcessed fails (DB hiccup) or the process
crashes before it runs. 60 s later a retry treats the in-flight
row as stale, re-claims it, and writes a second chat_message.
Close both with owner fencing + same-tx Mark:
- lark_inbound_message_dedup now carries a `claim_token` UUID;
ClaimLarkInboundDedup mints a fresh one on insert and on stale
re-take, so a reclaim ROTATES the token.
- MarkLarkInboundDedupProcessed and ReleaseLarkInboundDedup are
fenced on (message_id, claim_token, processed_at IS NULL) and
return rowsAffected. Zero means our token is no longer live, and
the caller treats it as a no-op (not an error).
- AppendUserMessage invokes MarkLarkInboundDedupProcessed INSIDE its
chat_message+session tx (qtx). If the token has been rotated by a
concurrent reclaim, the Mark matches zero rows and the method
returns ErrClaimLost; the deferred Rollback unwinds the
chat_message insert, so the other holder is the sole writer. The
durable write and the Mark therefore commit (or roll back)
atomically — there is no "committed but not yet Marked" window
for a crash or retry to exploit.
Dispatcher.processClaimed now returns a tri-state dedupFinalize directive
(none / mark / release): finalizeNone for the in-tx Mark path (and
ErrClaimLost), finalizeMark for audit-drop branches and the defensive
post-Append-success fallback, finalizeRelease for pre-durable infra
errors. ErrClaimLost is translated into OutcomeDropped + DropReason-
Duplicate at the Handle boundary, matching what the WS adapter expects
for a "another worker is the writer" outcome.
Regression tests:
- TestDispatcher_StaleReclaimRaceDoesNotDoubleWrite injects worker
B's reclaim via a beforeAppend hook so the claim_token rotates
between Claim and AppendUserMessage. Asserts worker A's
AppendUserMessage returns ErrClaimLost (no chat_message
committed), the dispatcher surfaces a duplicate drop, the token
rotated to a value distinct from A's original, and a follow-up
replay still duplicate-drops.
- TestDispatcher_InTxMarkPreventsPostCommitReclaim verifies the
"Mark window" case is unreachable: a successful in-tx Mark
produces exactly one Mark call (no post-finalize duplicate), the
row is terminal, and a retry with dedupReclaim=true still
duplicate-drops without re-rotating the token.
- TestDispatcher_InTxMarkSucceedsAndSkipsPostFinalize pins the
positive contract: DedupMarked=true must make applyFinalize a
no-op (no extra Mark, no Release).
fakeQueries gains a fakeDedupRow model carrying (processed, token,
rotations) so the test seam matches production's UPDATE-with-WHERE
semantics; fakeChat gains a beforeAppend hook to inject race timing.
go test ./... and go vet ./... pass.
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations/lark): real Lark HTTP APIClient for IM v1 send/patch (MUL-2671)
Lands the production Lark Open Platform HTTP APIClient that replaces
the stub for outbound transport. The patcher's "thinking → streaming
→ final | error" card lifecycle and the dispatcher's binding-prompt
card both now reach Lark for real once MULTICA_LARK_HTTP_ENABLED=true.
Scope of this stage:
- tenant_access_token retrieval via /open-apis/auth/v3/
tenant_access_token/internal, cached in-process per app_id with a
60s safety margin against Lark's `expire` value. Sub-2-minute
expires are clamped to 120s so we never cache an entry that's
already past its safe window.
- SendInteractiveCard: POST /open-apis/im/v1/messages?receive_id_type=chat_id
returning the Lark message_id the Patcher persists in
lark_outbound_card_message for later patches.
- PatchInteractiveCard: PATCH /open-apis/im/v1/messages/:id with
the full re-rendered card body (Lark's update endpoint replaces,
not deep-merges).
- SendBindingPromptCard: open_id-targeted interactive card with a
primary "去绑定" CTA pointing at the redemption URL. Template is
co-located with the transport so the dispatcher never has to know
about Lark's card schema.
- Token-error invalidation: Lark codes 99991663 (expired) /
99991664 (invalid) drop the cached token so the next call
refreshes from /tenant_access_token/internal instead of looping
on a stale entry.
Out of scope (deferred to follow-up stages):
- ExchangeOAuthCode stays unimplemented behind
ErrAPIClientNotConfigured. The PersonalAgent install handshake's
response shape (returning per-installation app credentials in a
single call) is not yet verified against the production endpoint,
and a silent mis-fill of OAuthExchangeResult would corrupt
lark_installation rows past validateExchangeResult. Operators
continue to use the manual-paste InstallationService path until
the OAuth stage lands.
- Inbound WS EventConnector — Hub's ConnectorFactory still needs a
real wire-protocol implementation.
Wiring:
- MULTICA_LARK_HTTP_ENABLED=true switches router.go from the stub
to the real client. MULTICA_LARK_HTTP_BASE_URL overrides the
default open.feishu.cn host (set to open.larksuite.com for the
Lark international tenant, or to an httptest URL for integration
tests).
- The OAuth handler now also receives the real client (its
ExchangeOAuthCode still surfaces ErrAPIClientNotConfigured, so
callback behavior is unchanged until that stage lands).
Tests (19 new cases against an httptest.Server fake):
- happy path send/patch/binding-prompt round trips, asserting URL
query params, body shape, Authorization header
- token cache: 3 sends share one /tenant_access_token/internal hit
- token refresh after clock-driven expiry
- sub-margin expire clamping (10s expire → cached for >= safety
margin of wall-clock)
- Lark error code surfacing (230001 send, 230002 patch, 10003 auth)
- token-expired (99991663) invalidates the cache; caller's retry
re-fetches and succeeds
- non-2xx HTTP status surfaces "http 500: …"
- input validation: missing chat_id short-circuits BEFORE auth
round-trip, missing card json / open_id / bind url all fail
pre-flight without hitting Lark
- ExchangeOAuthCode still returns ErrAPIClientNotConfigured
- binding-prompt template carries the BindURL and the localized
"去绑定" CTA in valid JSON
go build ./..., go vet ./..., and go test ./internal/integrations/lark/...
pass. Pre-existing handler/router integration tests that require a
real Postgres connection are unaffected by this change.
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): split outbound vs OAuth-install capability + card update_multi (MUL-2671)
Address Elon's two must-fix items from the HEAD a09993b1 review:
1. HTTP outbound and OAuth-install are now distinct APIClient
capabilities. The new SupportsOAuthInstall() reports whether the
install flow can succeed end-to-end (i.e. ExchangeOAuthCode is
implemented); the real httpAPIClient still returns IsConfigured()
= true (send / patch / binding prompt work) but
SupportsOAuthInstall() = false until the PersonalAgent install-time
response shape is pinned. Handler-side `install_supported` and
StartLarkInstall now gate on SupportsOAuthInstall, so a half-wired
client never reveals the scan-to-bind UI. larkOAuthErrorReason also
maps ErrAPIClientNotConfigured to a dedicated
`oauth_exchange_unimplemented` reason so a raw callback hit no
longer masquerades as `internal_error`.
2. defaultRenderer now emits config.update_multi=true on every Kind.
Lark refuses to apply PatchInteractiveCard to a card whose initial
config doesn't declare it shared/updatable, so the absent flag
would make every patch after the first send silently no-op on the
wire while the local outbound status row still flipped to
streaming/final.
Tests cover both halves of each fix:
- TestHTTPClient_SupportsOAuthInstall_FalseUntilExchangeLands +
TestHTTPClient_StubReportsBothCapabilitiesFalse pin the new
capability surface.
- TestStartLarkInstall_TransportOnlyClientReportsNotConfigured +
TestListLarkInstallations_TransportOnlyClientReportsInstallNotSupported
pin the handler gate at exactly the half-wired state.
- TestLarkOAuthErrorReason_APIClientNotConfigured pins the mapping
for both the bare sentinel and the fmt.Errorf-wrapped form
HandleCallback produces.
- TestDefaultRendererConfigCarriesUpdateMulti covers every CardKind.
- TestHTTPClient_(Send|Patch)InteractiveCard_DefaultRendererBodyHasUpdateMulti
verify the wire body Lark actually receives carries update_multi
through both send and patch transport paths.
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations/lark): real OAuth code exchange + agent-detail bind entry (MUL-2671)
Stages the install side of the MVP critical path on top of the real
HTTP outbound work:
- httpAPIClient.ExchangeOAuthCode runs the production Lark v2 OAuth
flow: POST /authen/v2/oauth/token to swap the authorization code
for the installer's open_id, then GET /bot/v3/info under the parent
app's tenant_access_token to fetch bot_open_id. Result feeds
InstallationParams unchanged so OAuthService.HandleCallback's
auto-bind step lights up automatically.
- HTTPClientConfig gains OAuthAppID/OAuthAppSecret, read from the same
MULTICA_LARK_OAUTH_APP_ID/_APP_SECRET env vars the OAuthConfig
consumes. SupportsOAuthInstall now mirrors that pair so the install
capability gate is honest: outbound transport without OAuth creds
reports configured-but-not-install-supported, exactly like before.
- Agent detail inspector wires the LarkAgentBindButton in a new
Integrations section, viewer-hidden by canEdit. The button still
self-hides when SupportsOAuthInstall is false, so a deployment
without OAuth creds renders the section empty rather than CTA-broken.
- Capability wording cleaned across handler / router / lark-tab to say
"OAuth-install capability" instead of "real APIClient wired", and
the misleading TransportOnly... test was renamed/refocused on the
early-return branch it actually exercises (Elon non-blocking note).
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): identity-only OAuth + atomic bind (MUL-2671)
Addresses Elon's round-4 must-fix items on PR #3277:
1. OAuth v2 token → user_info chain now matches Lark's official
user-OAuth shape. `httpAPIClient.ExchangeOAuthCode` POSTs
/open-apis/authen/v2/oauth/token (RFC 6749: top-level
access_token, NO open_id), then GETs /open-apis/authen/v1/user_info
with the user_access_token as Bearer to obtain the installer's
open_id / union_id. The test fixture now reflects the real
wire shape (separate user_info handler; no synthetic open_id in
the token response).
2. `OAuthExchangeResult` is identity-only — drops the synthesized
shared-parent AppID / AppSecret / BotOpenID return that broke
the UNIQUE(app_id) constraint and the dispatcher's per-app_id
routing. `OAuthService.HandleCallback` no longer Upserts an
installation row: it looks up the lark_installation already
provisioned via the manual-paste POST /lark/installations route
and binds the installer onto it. Two new typed errors —
ErrInstallationNotProvisioned and ErrInstallationRevoked — map
to `installation_not_provisioned` / `installation_revoked`
reasons at the HTTP boundary so the UI can guide the admin.
The PersonalAgent install API (which would deliver
per-installation bot credentials at scan time) remains a
follow-up; until it lands the OAuth flow is identity-binding
only and the agent-detail bind button stays hidden on
deployments without OAuth env (capability gate unchanged).
3. The installation lookup + installer bind run inside a single
DB transaction so a concurrent revoke / re-provision between
the read and the binding insert cannot leak a half-applied
state. `InstallerBinder.BindInstaller` is renamed to
`BindInstallerTx` and accepts the OAuth-service-owned
transaction's qtx; the binding_token redemption path is
unchanged.
4. `validateExchangeResult` is simplified to require only the
installer's open_id; the obsolete ErrExchangeMissingAppID /
AppSecret / BotOpenID sentinels are removed (no caller can
trip them now). The oauth_test suite is rewritten to use a
stub failTxStarter so tests covering state-token verification
and exchange-error propagation remain DB-free, while a new
TestOAuthCallbackOpensTxAfterValidExchange pins the post-must-fix
order (state ok + exchange ok ⇒ Begin runs before any lookup
or bind, and a Begin failure aborts cleanly with no bind).
Verified locally:
- go build ./... / go vet ./... clean
- go test ./internal/integrations/lark/... ✓
- go test ./internal/handler -run 'Lark|Binding|OAuth' ✓
- go test ./internal/util/secretbox/... ./internal/service/... ✓
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations/lark): device-flow scan-to-install (MUL-2671)
Replaces the manual paste-credentials install path + identity-only
OAuth callback (rejected in product review: too many steps before a
user sees value) with a true single-step scan-to-install built on
Lark's RFC 8628 device-flow registration endpoint
(POST accounts.feishu.cn/oauth/v1/app/registration) — the same
protocol the official larksuite/oapi-sdk-go/scene/registration
package and zarazhangrui/feishu-claude-code-bridge use.
User journey: admin clicks "Bind to Lark" on the Agent detail page
→ QR dialog opens → admin scans in the Lark app on their phone →
authorizes the new PersonalAgent → dialog auto-closes with the new
installation visible. No app_id / app_secret to copy, no Lark
developer console visit, no Multica-side OAuth env to configure.
Backend (server/internal/integrations/lark):
- registration.go — inline ~280-line RFC 8628 client. Begin posts
archetype=PersonalAgent / auth_method=client_secret /
request_user_info=open_id; Poll follows the upstream SDK's
state machine including the tenant-brand mid-stream domain swap
to accounts.larksuite.com when a Lark-international account
authorizes. SDK is NOT vendored — one endpoint isn't worth
dragging the full oapi-sdk-go + transitive deps.
- registration_service.go — owns the in-process session store
+ background polling goroutine. On success calls APIClient.GetBotInfo
(the new IM-side endpoint added below) and writes
lark_installation + the installer's lark_user_binding inside
one DB transaction so a half-applied install can never land.
Stable error_reason codes (expired / access_denied /
lark_protocol_error / bot_info_failed / installation_conflict /
installer_bind_failed / internal_error) drive the UI copy
without parsing prose.
- client.go / http_client.go — drops ExchangeOAuthCode and
SupportsOAuthInstall (no longer applicable: device-flow returns
identity alongside credentials in one response); adds GetBotInfo
which mints a tenant_access_token from the freshly-minted
client_id / client_secret and calls /open-apis/bot/v3/info for
the bot_open_id. install_supported now gates on IsConfigured()
(real HTTP client wired) instead of a separate OAuth capability.
- binding_token.go — absorbs InstallerBindParams / InstallerBinder
(previously in oauth.go), retargets the doc-comment from the
OAuth caller to the device-flow caller.
- Deletes oauth.go + oauth_test.go entirely.
Handler & router (server/internal/handler, server/cmd/server):
- POST /api/workspaces/{id}/lark/install/begin — opens a new
registration session, returns {session_id, qr_code_url,
expires_in_seconds, poll_interval_seconds}. Admin-only.
- GET /api/workspaces/{id}/lark/install/{sessionId}/status —
polling endpoint, returns {status, installation_id?, error_reason?,
error_message?}. Workspace-scoped lookup so a stolen session_id
cannot be polled from another workspace. Admin-only.
- Removes POST /lark/installations (paste form),
GET /lark/install/start (OAuth-redirect entry), and
GET /api/lark/install/callback (OAuth redirect target).
- Removes MULTICA_LARK_OAUTH_APP_ID / _APP_SECRET / _REDIRECT_URI /
_STATE_SECRET / _AUTHORIZE_URL / _SUCCESS_URL env vars. Self-host
operators no longer need a parent Lark app at all.
Frontend (packages/core, packages/views):
- New types BeginLarkInstallResponse / LarkInstallStatusResponse
+ matching API methods (beginLarkInstall / getLarkInstallStatus);
drops getLarkInstallURL.
- LarkAgentBindButton opens LarkInstallDialog instead of a
window.open() to Lark's authorize page. The dialog uses
react-qr-code (catalog) to render the verification_uri_complete
inline as SVG (no external CDN image), polls status at the
server-supplied cadence, auto-closes on success, offers
"scan again" on terminal failure. Per CLAUDE.md "Enum drift
downgrades, not crashes", error_reason switch has a default
fallback so an older desktop build on a newer server still
renders the generic failure copy.
- Adds the device-flow strings to en + zh-Hans settings.json;
removes the obsolete OAuth-not-configured copy.
Verified locally:
- go build ./... / go vet ./... clean
- go test ./internal/integrations/lark/... — all green
(existing tests + 15 new registration / GetBotInfo tests)
- go test ./internal/handler -run 'Lark|Binding' — all green
- pnpm typecheck — all 6 packages clean
- pnpm lint — 0 errors (15 pre-existing warnings, none in changed files)
- pnpm --filter @multica/views test — 859/859 pass
Pre-existing failures in server/internal/middleware (column
"profile_description" missing from local test DB) reproduce against
the parent commit and are unrelated to this change.
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): gate bind CTA to workspace admins, terminate QR polling on 4xx (MUL-2671)
Two frontend must-fixes from the PR #3277 二审:
1. LarkAgentBindButton now self-hides for non-admin viewers in addition
to the existing install_supported check. The agent-detail page mounts
the button under `canEdit`, which canEditAgent lets agent owners
through even when they are not workspace admins — but the backend
gates POST /lark/install/begin and the status poll on owner/admin
(router.go:478-487), so the previous behavior shipped a CTA that was
guaranteed to 403. The new gate reads workspace role from the same
member list the settings tab already uses.
2. The status polling loop now terminates on 404 (session gone — server
restarted, multi-instance routing, or in-process GC swept it) and
403/401 (permission revoked mid-session). Previously every error
path scheduled another setTimeout, which trapped the user on a stale
QR forever. ApiError gives us the HTTP status verbatim; terminal
responses set status=error with stable error_reason codes
(session_lost, forbidden) that flow through the existing dialog
switch + retry/close affordances. 5xx + network blips still retry.
i18n: new install_error_session_lost / install_error_forbidden in en
and zh-Hans, with default fallback preserved per the enum-drift rule.
Coverage: 6 new vitest cases — admin/owner allow, member deny,
unsupported-install deny, and the two terminal-error polling paths
using fake timers to assert the loop stops scheduling.
Also clears a handful of stale OAuth/manual-install doc comments
flagged in the review (non-blocker cleanup): doc.go's §10 now points
at RegistrationService, installation.go's input-shape doc loses the
OAuth-callback half, and client.go's stubAPIClient comments no longer
reference OAuth callbacks.
Co-authored-by: multica-agent <github@multica.ai>
* docs(integrations/lark): describe gate as device-flow install in agent-detail integrations comment (MUL-2671)
The comment block above the agent-detail Integrations section still
described the capability gate as 'server-side OAuth-install'. The
OAuth path is gone — install is now device-flow per RFC 8628 — so the
comment now reads 'server-side device-flow install capability gate'.
Pure comment change; behavior is unchanged. Cleans up the nit Elon
called out in PR #3277 二审 (MUL-2671).
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations/lark): wire inbound pipeline + WS Hub at boot (MUL-2671)
Stage 3.a of MUL-2671. Hub class, Dispatcher, ChatSessionService and
AuditLogger have all been implemented and tested in prior PRs but
none of them was constructed at boot, so the in-process plumbing
was never exercised end-to-end. This change wires them together
behind the same `MULTICA_LARK_SECRET_KEY` gate that already gates
InstallationService / RegistrationService, and starts the Hub under
the existing `sweepCtx` so it winds down alongside the other
long-running workers after HTTP drain.
The real long-conn EventConnector is still pending; the factory
hands every supervisor a shared NoopConnector that holds the lease
and emits nothing. That lets staging exercise the lease /
supervisor / shutdown lifecycle against real DB rows without
committing to the Lark wire protocol implementation. Swapping in
the real connector is a single line change in the same router
block; the Dispatcher / ChatSessionService / Hub seams stay frozen.
## Why a noop placeholder, not a stub-or-skip
The Hub's value is mostly its lifecycle: §4.4 ownership lease,
LeaseRenewInterval / LeaseTTL, supervisor reap on revoke, clean
release on shutdown. None of that runs unless the Hub is actually
started. Holding off until the real connector lands means the next
PR has to debut both pieces simultaneously; wiring the supervisor
loop first lets the real connector PR be a focused, reviewable
swap.
## Changes
- `internal/integrations/lark/noop_connector.go` — `NoopConnector`
implementing `EventConnector`: blocks on ctx until the Hub
cancels (lease loss / shutdown / revoke), emits no events, logs
on enter/exit so operators see exactly which installation the
supervisor is holding the lease for.
- `internal/integrations/lark/noop_connector_test.go` — verifies
the connector blocks until ctx cancel, returns nil on clean exit,
never invokes the emit callback, and the factory shares a single
connector instance across installations.
- `internal/handler/handler.go` — new `LarkHub *lark.Hub` field on
`Handler`. Nil when the Lark integration is disabled.
- `cmd/server/router.go` — inside the existing Lark wiring block,
construct `AuditLogger`, `ChatSessionService` (with `*pgxpool.Pool`
for the in-tx dedup Mark), `Dispatcher` (wiring `h.IssueService`
and `h.TaskService` so `/issue`-created issues share counter /
duplicate guard / project boundary / broadcast / analytics with
the rest of the product), and the `Hub` with the
`NoopConnectorFactory`. `NewRouterWithOptions` now returns
`(chi.Router, *handler.Handler)` so main.go can drive Hub
lifecycle; `NewRouter` discards the handler.
- `cmd/server/main.go` — start the Hub under `sweepCtx` after the
other background workers, and `Wait` on it after HTTP drain +
sweep cancel so the lease renewer can issue a final release
before exit. Skipped entirely when `h.LarkHub == nil`.
## Test plan
- [x] `go build ./...` clean
- [x] `go vet ./...` clean
- [x] `go test ./internal/integrations/lark/...` (new noop tests +
existing hub / dispatcher / chat_service / registration /
binding_token / outbound / issue_command suites) — all pass
- [x] `go test ./internal/handler -run 'TestLark|TestRedeemLarkBinding'`
pass — handler-side Lark surfaces unchanged
- [x] `go test ./internal/service/... ./internal/util/secretbox/...`
pass
- [x] `pnpm --filter @multica/views exec vitest run settings/components/lark-tab`
pass (6/6) — frontend lark surfaces unchanged
- [ ] Local broad `go test ./internal/handler/...` still blocked by
the pre-existing test DB schema drift Elon flagged in the
previous round (`column "metadata" does not exist`,
unrelated to this change); CI is the authoritative check.
- [ ] Manual end-to-end deferred until the real long-conn
EventConnector lands (next stage).
MUL-2671
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): bound Hub lease release + shutdown wait (MUL-2671)
Lease release used context.Background(); a stalled DB pool could pin
shutdown indefinitely. Add LeaseReleaseTimeout (5s default) and
ShutdownTimeout (15s default) to HubConfig, route releaseLease through
a bounded context, and expose WaitWithTimeout for main.go so a wedged
supervisor degrades to LeaseTTL expiry on the next replica instead of
blocking process exit. Also correct the LarkHub field comment in
handler.go: the Hub is wired whenever the at-rest secret key is set,
independent of whether the outbound HTTP APIClient is configured.
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations/lark): real WS long-conn connector + ctx-cancel-breaks-read (MUL-2671)
Replaces NoopConnectorFactory with a production EventConnector that
opens Lark's event-subscription WebSocket. Gated behind
MULTICA_LARK_WS_ENABLED so staging boots stay on the noop path until
operators opt in, and falls back to noop with a warning when the WS
flag is set without MULTICA_LARK_HTTP_ENABLED (the real connector
needs the cached tenant_access_token).
Why this connector exists separately from the Hub: gorilla/websocket
ReadMessage blocks on the underlying TCP socket and does not observe
context. The watchdog goroutine inside WSLongConnConnector.Run closes
the conn the moment ctx fires, so lease loss / shutdown breaks the
blocking read in bounded time — exactly the invariant Hub
renewLeaseUntil's runCancel depends on for the "at most one active WS
per installation across replicas" guarantee. Tests cover this
explicitly (TestWSConnectorRunReturnsOnCtxCancelEvenWhenReadIsBlocked).
The Lark wire surface is split into three swappable seams so the
transport layer stays tested in isolation:
- EndpointFetcher (POST /event-subscription/v1/connection_token)
resolves a one-shot wss URL per Run. No caching — replaying a
one-shot token would look like a Lark outage.
- FrameDecoder turns one raw JSON envelope into an InboundMessage
or a "control / heartbeat / drop" verdict. Decoder errors log
+ drop the frame; they do NOT tear down the connection.
- CredentialsProvider wraps InstallationService.DecryptAppSecret
so plaintext app_secret lives in memory only during a Run.
Also fixes the handler.go LarkHub comment: it still said "joins on
Wait during graceful shutdown" but main.go has used WaitWithTimeout
(bounded wait) for several commits. Comment now matches.
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations/lark): align WS to official binary Frame protocol + DispatchResult outbound replies (MUL-2671)
Two must-fix items from Elon's review of PR #3277:
1. WS protocol layer rewritten to match the official Lark Go SDK
(`larksuite/oapi-sdk-go/v3/ws`):
- Bootstrap is `POST /callback/ws/endpoint` with AppID/AppSecret
in the body (no tenant_access_token bearer). Response carries
wss URL + ClientConfig (PingInterval / ReconnectInterval /
ReconnectNonce / ReconnectCount).
- `service_id` is parsed from the wss URL query and used as
Frame.Service on every outbound frame.
- Wire envelope is the binary protobuf `pbbp2.Frame` (hand-rolled
via protowire to avoid pulling the whole SDK in, byte-identical
field tags). JSON payloads are nested inside Frame.Payload.
- Inbound data frames are ACKed with a `Response{code:200,...}`
JSON payload that reuses the inbound headers; infra failures
produce code=500 so Lark retries.
- Ping is the app-layer binary `NewPingFrame(serviceID)` at the
server-supplied cadence; WebSocket protocol PING is removed
(Lark ignores it). Server-initiated pings get a pong reply.
- ctx-cancel-breaks-read invariant preserved via the watchdog
goroutine that closes the conn on ctx.Done; the read loop and
ping goroutine serialize their writes through a single mutex.
2. `DispatchResult` outbound replies wired via a new `OutcomeReplier`:
- `OutcomeNeedsBinding` mints a one-shot binding token and sends
the binding prompt card to the sender's open_id.
- `OutcomeAgentOffline` / `OutcomeAgentArchived` push a notice
card into the chat with the agent name + Chinese copy matching
§4.6.
- `OutcomeIngested` stays owned by the Patcher; `OutcomeDropped`
is silent.
- The replier is best-effort: outbound failures are logged and
swallowed so a Lark outage cannot stall the inbound pipeline.
- Hub installs the noop replier by default; router wires the
production `LarkOutcomeReplier` when APIClient.IsConfigured().
PersonalAgent long-conn risk surfaced (open per Feishu docs:
`长连接模式仅支持企业自建应用`). The implementation works for any
app archetype; the open question is whether `/callback/ws/endpoint`
accepts PersonalAgent credentials in practice. Surfacing the Lark
code+msg verbatim from the bootstrap response so an operator running
the smoke test sees the exact failure rather than a generic timeout.
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): byte-compat Frame marshal, chunk reassembly, ACK off reply critical path (MUL-2671)
Three protocol blockers from Elon's review of 9540008a:
1. Frame.Marshal is now byte-identical to oapi-sdk-go/v3/ws/pbbp2.Frame:
- SeqID/LogID/Service/Method (proto2 req) emit unconditionally even at zero
- PayloadEncoding/PayloadType/LogIDNew emit unconditionally per gogo
generated MarshalToSizedBuffer (no zero-guard)
- Payload uses the SDK's `!= nil` guard (nil omits, []byte{} emits 0-length)
- ACK payload JSON matches SDK's NewResponseByCode + json.Marshal output
({"code":N,"headers":null,"data":null})
Golden tests pin exact byte sequences for ping/pong/ACK/full/zero
frames; verified against the real SDK pbbp2.pb.go MarshalToSizedBuffer
producing identical bytes.
2. Multi-frame events (sum>1) are reassembled via the new chunkAssembler:
- 5s sliding TTL (matches SDK combine() cache TTL)
- Lazy GC on admit (no separate sweeper goroutine)
- Out-of-order seq + duplicate seq idempotent
- Partial chunks are NOT ACKed (SDK behaviour: only the final chunk's
ACK confirms the whole event so Lark can retry on partial loss)
- Connector wires assembler per-Run; state dies with the session
3. OutcomeReplier detached from ACK critical path:
- HubConfig.ReplyTimeout default 2.5s, strictly under Lark's 3s ACK deadline
- handleEvent dispatches synchronously (fast DB path), then spawns the
replier under a fresh background ctx with WithTimeout(ReplyTimeout)
- Hub.replyWg tracks in-flight replies; Hub.Wait / WaitWithTimeout
drain them so shutdown is bounded
- Noop replier short-circuits inline (no goroutine cost when outbound
APIClient isn't configured)
Proof tests:
- TestHubScheduleReplyReturnsImmediately: scheduleReply with a 10s
slow replier returns in <50ms
- TestHubReplyTimeoutCancelsHungReplier: hung replier ctx fires at
ReplyTimeout
- TestHubWaitDrainsInFlightReplies: Wait blocks until replies finish
- TestHubACKNotBlockedByOutboundReply: end-to-end through the
connector — data-frame ACK lands within 500ms even when the
replier hangs 5s
PersonalAgent real-env smoke remains Bohan's decision; this PR closes
the technical blockers Elon flagged.
Co-authored-by: multica-agent <github@multica.ai>
* docs(service/issue): narrow position concurrency claim to create-create (MUL-2671)
Elon's review of the merge resolution flagged that the comment on the
new NextTopPosition call promised more than the code guarantees:
concurrent manual reorder via UpdateIssue(position) does NOT take the
workspace row lock that IncrementIssueCounter holds, so a create
racing a reorder can still land on the same position. Rewrite the
comment to only claim create-create serialization, which is the
behaviour the lock actually delivers. No code change.
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): keep device-flow polling on RFC 8628 HTTP 400 (MUL-2671)
Lark's device-flow polling endpoint returns HTTP 400 with the JSON
body `{"error":"authorization_pending"}` while the user hasn't scanned
the QR yet — this is the RFC 8628 spec, and the upstream oapi-sdk-go
implements the same handling. Our previous doForm treated ANY non-2xx
as a terminal protocol error, so every install session was killed by
the first poll (~5s after begin) and the install dialog appeared
silently empty: the frontend received status=error +
lark_protocol_error before the user could even read the description.
Fix: doForm now decodes the JSON body first; if it parses, the caller
(Begin / Poll) routes on the body's `error` field, where the existing
switch correctly maps authorization_pending / slow_down to "keep
polling" and access_denied / expired_token to terminal failure. Only
unparseable bodies (5xx HTML proxy pages, gateway timeouts) still
surface as a typed http_NNN RegistrationError.
Three regression tests pin the new behaviour:
- HTTP 400 + authorization_pending → res.Status="authorization_pending"
- HTTP 400 + access_denied → res.Err.Code="access_denied" (terminal)
- HTTP 502 + HTML body → http_502 RegistrationError
Verified against the live local env: install/begin -> 200, status
stays "pending" through the first poll cycle, no longer flips to
"error" within seconds.
Co-authored-by: multica-agent <github@multica.ai>
* fix(views/lark): reset closedRef on every mount so StrictMode double-mount renders QR (MUL-2671)
Empty QR dialog body in the dev env: Bohan opened the bind dialog and
got an empty white area where the QR should have been — no QR, no
"starting" placeholder, no error text. Backend was returning the QR
URL correctly; the bug was on the frontend.
Root cause: React 19 / Next.js dev StrictMode mounts every component
twice (mount → cleanup → mount). The component instance is REUSED
across the simulated remount, which means useRef objects are
preserved. The dialog's `closedRef` lifecycle:
1. Mount #1: closedRef={current:false}, beginSession() kicked off
(HTTP request still in flight)
2. Cleanup runs: closedRef.current=true
3. Mount #2: beginSession() kicked off again, BUT the ref still
reads {current:true} from step 2
4. Both promises resolve. Both hit the post-await guard
`if (closedRef.current) return;` and bail out before setSession().
5. Result: session stays null forever. Every conditional in the
dialog body (beginning/session-pending/success/error) is false →
empty body.
Fix: reset closedRef.current=false at the START of the effect, not
just at component construction. The cleanup-then-mount pair now
re-arms the guard so subsequent setSession calls actually land.
Regression test wraps the dialog in <StrictMode> and asserts the
QR appears within 2s with the correct value — fails closed if anyone
removes the reset.
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): drop EventTaskCompleted subscription so the chat reply doesn't get overwritten by "Done." (MUL-2671)
Bohan reproduced on the live dev env: agent replies show only a card
saying "Done." in Lark, even though Multica's own chat panel has the
real "Hello! I'm cc…" reply. Tasks succeed end-to-end, but the user
loses the reply on the Lark side.
Root cause: TaskService.CompleteTask publishes two events for every
chat task IN ORDER:
1. broadcastChatDone(...) → ChatDonePayload{Content: "Hello!..."}
2. broadcastTaskEvent(Completed) → map[string]any{task_id, agent_id,...}
(no `content` key)
The Patcher subscribed to BOTH and routed each to finalize(). The
first patch correctly rendered the reply text, the second
patched the same card with an empty payload — chatDoneContent()
returned "" and the renderer fell back to "Done." (default empty-body
copy). The second patch wins because Lark stores whatever was last
applied.
Fix: stop subscribing to EventTaskCompleted in the Patcher and remove
the corresponding switch arm. EventChatDone is the canonical "agent
finished replying" signal for the Lark card path; EventTaskCompleted
is still emitted to the bus for other listeners (web UI, analytics,
task usage) where the lack of content doesn't matter.
Regression test TestPatcherIgnoresEventTaskCompletedForChatTasks
emits ChatDone followed by TaskCompleted on a streaming card and
asserts: exactly one patch, body contains the agent reply, body does
NOT contain "Done.". If anyone re-adds the EventTaskCompleted
subscription, this fails immediately.
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations/lark): chat replies as plain text IM messages, not card chrome (MUL-2671)
Bohan reported on the live dev env that even with the agent's reply
shown correctly, every message is wrapped in an interactive card with
the agent name as the header — it feels like a system notification,
not a normal chat reply. He wants the reply to land as a regular Lark
text bubble.
Changes:
- Add APIClient.SendTextMessage backed by Lark's
/open-apis/im/v1/messages with msg_type=text. JSON-encodes the
{"text": ...} envelope Lark requires so callers pass raw strings.
- Patcher.Register no longer subscribes to EventTaskQueued /
EventTaskRunning. There is no more thinking → running → final
card lifecycle on the success path: it added card chrome without
buying anything for free-form chat.
- On EventChatDone, the new sendChatReply path posts the assistant
message content as plain text. Empty content is silently dropped
rather than rendered as "Done." (the prior fallback that
confused Bohan).
- Failure path keeps a one-shot error card on EventTaskFailed —
the visual distinction from a normal reply is genuinely useful,
and failures are rare enough that the chrome isn't noisy.
- Throttle / lastPatched map / MinPatchInterval / shouldPatch /
markPatched / loadCardOrSkip are all removed; nothing in the new
flow patches.
Tests:
- TestPatcherSendsPlainTextOnChatDone pins the new contract: exactly
one SendTextMessage call, no card sends or patches, content
matches the ChatDonePayload.
- TestPatcherDropsEmptyChatReply pins the "no more Done. fallback"
decision — empty content drops, period.
- TestPatcherFailEventSendsErrorCard pins the failure path still
uses a card (one-shot, no patching).
- TestPatcherIgnoresEventTaskCompletedForChatTasks rewritten for
text path: ChatDone then TaskCompleted yields exactly one text
send, no duplicate.
- TestPatcherSkipsWhenNoChatSessionBinding and
TestPatcherSwallowsInstallationLoadErrors rewritten to drive
EventChatDone (the new entry point) instead of TaskQueued.
- TestPatcherSendsThinkingCardOnTaskQueued deleted (no more
thinking card).
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations/lark): pre-fill PersonalAgent bot name as "<agent> - Multica" (MUL-2823) (#3520)
The device-flow install left the bot at Lark's auto-generated
"{用户姓名}的智能助手". Lark's registration scene supports pre-filling the
name via a `name` query param on the verification/QR URL (mirrors the
upstream SDK's AppPreset.Name) — a user-editable default that rides on
the QR URL, not the begin POST body (which has no name field).
BeginInstall already loads the agent for its ownership check, so we keep
it and thread `<agent.Name> - Multica` through Begin → decorateQRCodeURL.
A blank name degrades to plain "Multica".
There is no post-install rename API (bot/v3 is read-only; no
bot/v3/update), so the install-time pre-fill is the only programmatic
lever; the user can still edit the name on the creation form.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): restore /issue confirmation + pin SendTextMessage wire (MUL-2671)
Two recovered/added contracts off Trump's review of HEAD fe381a07:
1) /issue confirmation in Lark was a casualty of the plain-text
refactor. The pre-refactor `RenderInput.IssueNumber` field was
declared but never actually rendered into the card body, so even
in the original card-based flow the user never saw a "Created
[MUL-42]" confirmation. Now the OutcomeReplier handles
OutcomeIngested + IssueID.Valid by sending a plain text message:
Created MUL-42 — fix login bug
https://multica.example/issues/MUL-42
Composed from a new DispatchResult.IssueIdentifier +
IssueTitle, populated by the Dispatcher from
workspace.IssuePrefix + issue.Number / issue.Title. Workspace
lookup is best-effort: a Postgres blip on workspace gets a "#42"
fallback rather than silently dropping the confirmation.
The agent's own chat reply (if any) continues to land separately
via the Patcher on EventChatDone — these are two semantically
distinct messages and the user benefits from seeing both.
2) SendTextMessage is the wire layer Trump flagged for missing
coverage. Three new wire tests pin:
- happy path: POST /open-apis/im/v1/messages?receive_id_type=chat_id,
msg_type=text, Bearer <tenant_access_token>, double-JSON
content envelope
- special-character round trip: newlines, double quotes,
backslashes, tabs, Chinese + emoji, JSON-lookalike strings.
The inner {"text": ...} is encoded once at JSON.Marshal time
and once again when the outer body serializes; losing either
pass corrupts the message and the bug is invisible without a
contract pin.
- Lark error path: non-zero `code` surfaces as a wrapped error
with the code embedded.
Tests:
- TestDispatcher_IssueCreationFromCommand asserts IssueIdentifier
("MUL-42") and IssueTitle propagate through DispatchResult.
- TestDispatcher_IssueIdentifierFallsBackToNumberOnWorkspaceLookupErr
pins the "#7" degrade-graceful fallback.
- TestLarkOutcomeReplierIssueCreatedSendsConfirmation pins the
text body (identifier + title + deep link) and asserts no card
send on this path.
- TestLarkOutcomeReplierOutcomeIngestedSilentWithoutIssue pins
the silent-on-plain-chat default so we don't accidentally start
emitting a confirmation for every message.
- TestHTTPClient_SendTextMessage_* covers the wire contract.
Frontend locale parity (en + zh-Hans, 53 tests) is currently green
on this HEAD; no changes needed.
Co-authored-by: multica-agent <github@multica.ai>
* fix(views/locales): add missing ko keys for Lark MVP (MUL-2671)
Trump flagged on PR #3277 review that the ko bundle was missing the
Lark-MVP-only keys that en + zh-Hans both carry. The parity test
caught it cleanly after main was merged in (Korean PR landed on main
between the prior review and this one):
common.lark_bind.* (13 keys)
settings.page.tabs.lark (1 key)
settings.lark.* (45 keys)
agents.inspector.section_integrations (1 key)
Korean translations are professional/concise — "Lark" stays as the
brand name (matches how en keeps "Lark" + "(飞书)" parenthetically;
ko/users searching for the product expect "Lark"), and product copy
follows the zh-Hans tone where Multica nouns ("에이전트", "워크스페이스")
are romanized loan words consistent with the rest of the ko bundle.
Slot ordering preserved against EN:
- page.tabs.lark sits between github and integrations
- inspector.section_integrations sits right after section_skills
Verified: pnpm exec vitest run locales/parity → 105/105 pass.
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): /issue origin_type CHECK + Hub restart on credentials rotation (MUL-2671)
Two live-env bugs Bohan reproduced:
1) /issue command crashed the WS connector. Dispatcher writes
origin_type='lark_chat' on issues born from `/issue`, but the
issue_origin_type_check CHECK constraint was last extended in
migration 060 for quick_create — it doesn't list lark_chat, so
every Lark /issue tripped SQLSTATE 23514 and bubbled up as an
infra error. The infra error tore down the WS connector, Lark
retried the same message, the new connector tripped the same
constraint and crashed again. Repro in the live env: three
crashes from the same /issue event over ~40s, each leaving the
user with no confirmation in Lark.
Migration 111 extends the CHECK list:
CHECK (origin_type IN ('autopilot', 'quick_create', 'lark_chat'))
2) Re-scanning an already-bound agent silenced the bot. The device
flow re-registers with Lark, which mints a brand-new bot (fresh
app_id + app_secret); RegistrationService.finishSuccess upserts
into lark_installation by agent_id, so the row's credentials
rotate in place. But the running supervisor held the OLD inst
struct by value and kept a WS open against the OLD bot's app_id —
so all events to the NEW bot went nowhere. Bohan's "claude code
现在不能在飞书里回复了" symptom maps exactly to this:
log timeline:
16:29:57 cc connector connected with app_id=cli_aa9398dd... (OLD)
16:34:07 lark registration: install complete (rotation)
→ row.app_id is now cli_aa93f36f... (NEW)
→ old WS still subscribed to OLD app_id; new app_id receives nothing
Fix: Hub.sweep now compares each installation row's credentials
fingerprint (app_id + bot_open_id + sha256(app_secret_encrypted))
against the snapshot the running supervisor was started with. On
diff, cancel the old supervisor and start a fresh one inline. A
monotonic gen counter on the supervisor entry disambiguates the
old goroutine's deferred cleanup from the new entry the rotation
path already swapped in.
Tests:
- TestHubRestartsSupervisorOnCredentialsRotation pins the new path:
starts hub on app_one, rotates the row to app_two, asserts the
connector factory is called again with the fresh AppID.
- TestHubDoesNotRestartSupervisorOnUnchangedRow pins the negative
case so an unchanged row doesn't degenerate into a per-sweep
busy-loop.
- Existing hub tests (lease, supervise, shutdown, ACK timing,
noop replier) all green.
Verification:
- go test ./internal/integrations/lark/... -race -count=1 ok
- go build ./... clean
- migration applied locally; \d+ issue confirms lark_chat in CHECK
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): per-supervisor lease token to fence rotation handoff (MUL-2671)
Elon flagged a race in HEAD be8d4cef's rotation path: both the old
and the new supervisors of the same Hub used the hub-wide nodeID as
their WS lease token, so an old supervisor's post-cancel
releaseLease(nodeID) would CAS-match the lease row the successor had
just acquired with the SAME token and DELETE it. Symptom would be a
silently empty lease row a few hundred ms after every device-flow
re-scan — no replica owning the install, no events delivered, the
"bot goes quiet" pattern Bohan hit the first time but now from the
fencing side rather than the credentials side.
Fix: leaseToken(nodeID, gen) composes "<nodeID>-g<gen>", where gen is
the monotonic counter already attached to each supervisorEntry. The
nodeID prefix keeps cross-replica observability (an operator
inspecting lark_installation.ws_lease_token can still map back to a
process) while the -g suffix makes the OLD supervisor's release
target the OLD row state. Once the rotation path swaps in the new
supervisor, the row's CurrentToken is the new -g(N+1) token, so the
old -gN release's WHERE clause no-ops instead of clobbering.
acquireLease / renewLeaseUntil / releaseLease now take an explicit
token argument; supervise threads its leaseToken through. The
plumbing isn't pretty, but having an explicit argument at every call
site is the only way the rotation invariant survives subsequent
refactors — without it, a future caller could quietly reintroduce
"just use h.nodeID" and the race is back.
Two regression tests:
- TestHubRotationStaleReleaseDoesNotClearSuccessorLease drives the
fake lease state machine directly:
1. old acquires(tokenA)
2. rotation lands; new acquires(tokenB)
3. old's stale release(tokenA) fires
Asserts owner ends up still tokenB. Hub-wide-nodeID code would fail
step 3 by clearing the entry.
- TestHubRotationEndToEndKeepsSuccessorLeased runs the same scenario
through the live supervise loop: starts hub, rotates the row, waits
for sup2 to take over with a distinct token, sleeps past sup1's
unwind, asserts the row is still held by a non-sup1 token. Catches
the bug even when the goroutine timing is non-deterministic.
Verification: go test ./internal/integrations/lark/... -race -count=1 ok
go build ./... clean
go vet ./... clean
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): route group @-mentions via union_id, not open_id (MUL-2671)
In a Lark group with multiple Multica bots installed, the bot whose WS
received the event sometimes failed to recognize that it was the @-target
while the OTHER bot's supervisor falsely fired. Bohan's controlled three-
message test (only @A, only @B, @both) hit this: @A and @B alone went
unanswered, @both got picked up by A only.
Root cause: the `mentions[].id.open_id` field Lark puts on the WS event
is structurally INVERSE to `/bot/v3/info`'s `bot.open_id` across the two
WSes. From A's WS perspective, the wire-form open_id for "A was @-ed"
is NOT equal to A's API-side open_id, but IS equal to what B's WS sees
on its side, and vice versa. The decoder's `mention.open_id ==
inst.BotOpenID` match therefore fires on the wrong bot in multi-bot
groups. Only `union_id` (the Lark-tenant-scoped stable identifier) is
consistent across both WSes.
Changes:
- migration 112 adds nullable `lark_installation.bot_union_id`
- sqlc query exposes UpsertLarkInstallation/CreateLarkInstallation
with bot_union_id, plus a focused SetLarkInstallationBotUnionID for
the backfill path
- httpAPIClient.GetBotInfo now follows /bot/v3/info with /contact/v3/
users/{open_id}?user_id_type=open_id and returns both identifiers
on BotInfo. Soft-fails on contact-scope denial: install still
succeeds with an empty UnionID, and the decoder falls back to the
legacy open_id match for single-bot deployments.
- RegistrationService.finishSuccess persists union_id alongside
open_id during the device-flow finalize.
- ws_frame_decoder.containsMention prefers union_id and only walks
open_id when the installation row has not been backfilled yet.
- BackfillBotUnionIDs runs once at server boot for installations
created before migration 112; bounded per-row 10s timeout and a
pure soft-fail policy so a slow Lark round-trip cannot block
startup.
- regression tests cover the three decoder paths: union_id match
wins over open_id mismatch, union_id mismatch overrides open_id
match, and open_id fallback when union_id is unknown.
Co-authored-by: multica-agent <github@multica.ai>
* chore: drop trailing blank lines at EOF on four files (MUL-2671)
git diff --check origin/main..origin/pr-3277 flagged these as new
blank lines at EOF; clearing so the diff stays clean for review.
Co-authored-by: multica-agent <github@multica.ai>
* fix(views/locales): add missing ja keys for Lark MVP + section_integrations (MUL-2671)
CI frontend job tripped on the ja locale parity check: ja is missing
the lark_bind block in common.json, the lark block + page.tabs.lark
in settings.json, and inspector.section_integrations in agents.json.
The ko fix earlier covered Korean; ja was added separately on main
and the merge surfaced these gaps. Translations mirror the en source
and follow the same voice as the existing ja bundle.
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): rewrite @_user_N placeholders into clean body (MUL-2671)
When Lark dispatches a group `im.message.receive_v1`, the message
text contains opaque `@_user_1`, `@_user_2`, … placeholders and the
real identity is in `mentions[]`. We were forwarding the raw text to
the agent, so a Bohan-typed "@Bot ping test" arrived as "@_user_1
ping test" — neither human-readable nor useful as LLM context, and
the agent was paying tokens to figure out which `@_user_N` was even
itself.
The new resolveMentions pass:
* strips the bot's own mention entirely (the dispatcher already
routes the event on AddressedToBot; re-emitting @<self> in front
of every message adds zero signal and pollutes context),
* substitutes other participants with `@<displayName>` so a
follow-up "@Alice" reads naturally,
* collapses horizontal whitespace introduced by the strip while
preserving original newlines.
Bot identity check uses the same union_id-preferred + open_id
fallback as containsMention, so the rewrite stays consistent with
the routing path. Tests cover the four shapes: bot self-mention,
mixed bot + other-user mention, multi-line body with stripped
mention, and a no-mention body that should be left untouched.
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): union_id-first self mention strip + token-aware scan + local whitespace cleanup (MUL-2671)
Three review blockers on the mention rewrite from PR review:
1. isBotMention now mirrors containsMention's union_id-first policy.
When the installation row knows our union_id, we trust it
exclusively (open_id is structurally inverted in multi-bot
groups — matching on it would re-introduce the routing bug we
fixed two commits ago). open_id fallback fires only when
union_id is absent. New tests: @-ing both bots in one message
correctly strips only self and renders the sibling as @<name>;
open_id-matches-but-union_id-differs does NOT strip.
2. resolveMentions no longer collapses or trims whitespace globally.
Indentation, tabs, code blocks, tables — all preserved verbatim.
When the self mention is removed we eat exactly one adjacent
horizontal space (the one after the placeholder, or, when the
mention sits at end-of-input, a single space already emitted
right before it). New test exercises a multi-line indented +
tabbed body and asserts the whole shape survives.
3. Prefix-collision-safe replacement. A chat with 11+ participants
exposes both `@_user_1` and `@_user_10`; naive ReplaceAll for
`@_user_1` would mangle the substring of `@_user_10`. The
resolver now does a single-pass token scan with the mention
list sorted longest-key-first, so the longer placeholder always
wins at any scan position. New test covers the @_user_1 /
@_user_10 case explicitly.
Also drops the temporary INFO-level diag logging the previous
commit added — root cause was confirmed (union_id swap in the
manual backfill; not a decoder bug).
Co-authored-by: multica-agent <github@multica.ai>
* fix(integrations/lark): scope inbound dedup per (installation_id, message_id) (MUL-2671)
Root cause of the residual "@Cc gets dropped as not_addressed_in_group"
even after the union_id swap landed: lark_inbound_message_dedup was
keyed on `message_id` alone. In a Lark group chat where the workspace
has multiple Multica bots installed, Lark delivers the SAME message_id
to every bot's WS supervisor. Whichever WS claimed first then ran its
own AddressedToBot check; the bot that was actually @-ed lost the dedup
race, found the row already terminal (`processed_at IS NOT NULL`), and
was dropped as `duplicate` BEFORE it could evaluate its own mention.
Net: every @ silently disappeared if Lark happened to route the OTHER
bot's WS first.
The dedup gate's original purpose (idempotency against WS reconnect
replay) is per-installation by definition, so the right key is
composite (installation_id, message_id).
Changes:
- migration 113 drops + recreates lark_inbound_message_dedup with
installation_id NOT NULL REFERENCES lark_installation(id) ON DELETE
CASCADE and PRIMARY KEY (installation_id, message_id). The table is
a 24h transient cache, so dropping existing rows is safe.
- sqlc queries: ClaimLarkInboundDedup / MarkLarkInboundDedupProcessed /
ReleaseLarkInboundDedup all now take installation_id.
- AppendUserMessageParams carries InstallationID through to the
in-tx Mark call so the chat_message+dedup atomicity stays intact.
- Dispatcher passes inst.ID to claim + applyFinalize + AppendUserMessage.
- Test fakes key dedup state on (installation_id, message_id) via a
composite map key; all existing pre-seeded rows use a seedDedupKey
helper bound to the default activeInstallation fixture so the prior
staleness / token-rotation / in-tx mark tests still exercise the
same regression they did before.
- New regression TestDispatcher_DedupIsScopedPerInstallation pins the
multi-bot invariant: a row pre-seeded for installation A does NOT
block installation B's first delivery of the same message_id; B
runs through its own group-filter / identity / ingest pipeline.
Co-authored-by: multica-agent <github@multica.ai>
* feat(integrations/lark): render markdown chat replies via schema-2.0 card (MUL-2671)
The agent's chat replies were going out as msg_type=text, so every
`**bold**`, fenced code block, list, table, and link in the body
showed up as literal markdown characters in Lark — the user saw raw
asterisks, hashes, pipes instead of formatted text. Bohan reported
this and pointed at zarazhangrui/lark-coding-agent-bridge as the
shape to emulate.
The bridge repo uses Lark interactive cards with the schema-2.0
envelope and a `tag: "markdown"` body element; Lark's client
renders that to formatted text (GFM-ish: bold/italic, headings,
lists, links, fenced code blocks, tables, blockquotes). They expose
multiple reply modes (card / markdown-as-post / text) gated by user
config; we go a step simpler — auto-detect markdown syntax in the
agent's body and route accordingly:
- containsMarkdown(): cheap substring + regex pass for fenced code
blocks, headings, list markers, bold/italic, tables, links,
blockquotes, horizontal rules, inline code. Biases toward false-
positive — wrapping prose in a card still renders fine, but
missing a real markdown block leaves raw characters visible.
- APIClient gains SendMarkdownCard / SendMarkdownCardParams.
Implementation marshals the schema-2.0 envelope verbatim:
{schema:"2.0", body:{elements:[{tag:"markdown", content: md}]}}.
Stub returns ErrAPIClientNotConfigured.
- Patcher.sendChatReply now branches on containsMarkdown:
markdown → SendMarkdownCard, plain prose → SendTextMessage. A
one-liner "sure, on it" stays as a normal IM bubble (no card
chrome); anything with markdown gets the rendered card.
Tests: TestContainsMarkdown pins the heuristic across plain prose
and ten markdown shapes; TestPatcherRoutesMarkdownReplyToCard and
TestPatcherRoutesPlainReplyToText cover the router; new HTTP wire
test TestHTTPClient_SendMarkdownCard_HappyPath contract-pins the
card envelope (msg_type=interactive, schema 2.0, markdown tag,
verbatim body). Full lark suite passes.
Co-authored-by: multica-agent <github@multica.ai>
* fix(service/issue): route analytics.IssueCreated through obsmetrics.RecordEvent (MUL-2671)
CI's TestNoNakedAnalyticsCaptureInHandlersOrServices guard caught the
post-merge analytics call in IssueService.captureCreatedAnalytics
that still used s.Analytics.Capture(...) directly. Main added that
lint to prevent the Prometheus and PostHog sides from drifting — any
new analytics.* event must go through obsmetrics.RecordEvent so the
business-metrics collector and the PostHog client fire from the same
call site.
Fix mirrors how TaskService handles it: IssueService gains a
Metrics *obsmetrics.BusinessMetrics field (router wires it via
h.IssueService.Metrics = opts.BusinessMetrics next to the existing
TaskService line), and the in-service Capture call becomes
obsmetrics.RecordEvent(s.Analytics, s.Metrics, ...). nil-safe by
construction — RecordEvent treats a nil Metrics as PostHog-only.
Co-authored-by: multica-agent <github@multica.ai>
* feat(views/lark): swap Bind CTA for Connected+Manage link when agent already has an installation (MUL-2671)
Bohan reported the agent-detail Bind button keeps inviting the user to
re-scan the QR even when the agent already has an active Lark
PersonalAgent connected — and re-scanning silently upserts the
installation row, leaving the previously-created Lark bot dangling
as a zombie. Frustrating UX and an actual product footgun.
Anti-zombie guard at the only entry point: LarkAgentBindButton now
checks the cached installations listing for an active row pinned to
this agent_id. When one exists, the install CTA is gone — replaced
by a small Connected pill + an "Manage in Lark" link that opens the
Bot's app page in Lark's developer console (open.feishu.cn/app/<app_id>)
in a new tab. That's where scopes, display name, and additional
permission requests actually live; re-scanning never was the right
answer for managing an existing bot.
Scoping is per-agent: an active installation on a DIFFERENT agent
in the same workspace doesn't affect this agent's button, and a
revoked installation falls back to the bind CTA so the user can
re-create. Tests cover all four states (own-active / own-revoked /
other-agent-active / no-installation) and pin the Manage link's
href + target=_blank + noopener.
i18n: three new keys in settings.json (en / zh-Hans / ja / ko):
agent_bot_connected_label, agent_bot_manage_link,
agent_bot_manage_tooltip. Locale parity test still 157/157.
The dev console host is hardcoded to open.feishu.cn — operators
on the Lark international tenant currently get the wrong host;
future-proof fix wants the backend to surface a per-installation
dev_console_url on the listings response, called out in a code
comment.
Co-authored-by: multica-agent <github@multica.ai>
* feat(views/settings): collapse Lark into Integrations + render agent identity (MUL-2671)
Lark was its own top-level workspace settings tab while Integrations sat
empty next to it. As more integrations land, the sidebar would balloon
with one tab per provider. Move the Lark surface into Integrations as
the first hosted integration; the old ?tab=lark URL redirects through
LEGACY_WORKSPACE_TAB_REDIRECTS so bookmarks still resolve.
The Connected bots list was leaking the raw Lark app_id (cli_…) as the
row title with bot_open_id (ou_…) underneath — meaningless to product
users. Since the binding is 1:1 with a Multica Agent, join on agent_id
and render the agent's avatar + name via the workspace-standard
ActorAvatar + useActorName.getAgentName. Deleted agents fall back to
"Unknown Agent" so the row is still actionable for cleanup.
Tests: stub useActorName + ActorAvatar in lark-tab.test.tsx and add
LarkTab connected-bot tests covering the agent identity render and the
deleted-agent fallback. Drop the now-dead integrations.* + page.tabs.lark
+ lark.bot_open_id_label keys across all four locales — parity still
157/157, views suite 1141/1141.
Co-authored-by: multica-agent <github@multica.ai>
* feat(views/settings): wrap Lark in a named section inside Integrations (MUL-2671)
Integrations is meant to host multiple providers (Slack, Linear etc. as
they land), so the Lark content should sit under a Lark heading rather
than fill the tab directly — otherwise the first additional integration
would feel like it broke the IA. Add a "Lark" / "飞书" section heading
above LarkTab using the same h2 chrome the other settings tabs use, and
pin lark.section_title across all four locales (parity 169/169).
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: J <j@multica.ai>
* feat(skills): introduce built-in agent skills (WIP)
Inject platform-authored, version-bundled skills into every agent on top of
its workspace-bound skills, so agents learn how to operate Multica correctly
without users needing to know the internals or agents needing to read source.
Mechanism: skills are embedded into the server binary and appended to the
agent payload at task-claim time (handler/daemon.go), reusing the existing
SkillData wire + daemon-side writeSkillFiles. The daemon needs no changes,
and because it travels over an existing wire field, older daemons pick the
skills up the moment the server ships.
First skill: multica-mentioning — how to build a working @mention (look up
the UUID, match type to id source, know what each mention type triggers).
WIP: injection mechanism + first skill only; more skills to follow in
dependency order (skill -> agent -> squad).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(skills): make multica-mentioning the standard template + add eval
Add the contract-skill frontmatter the other built-in skills will copy:
user-invocable:false (it triggers from context, not as a slash command)
and allowed-tools fencing it to the multica CLI it teaches. These keys
survive to agent machines untouched (ensureSkillFrontmatter only ever
adds a missing name).
Add a Go eval in builtin_skills_test.go (a _test.go so it never ships to
agent machines via the skill-files walk):
- Enforces the template invariants on every built-in skill, present and
future: multica- prefix, name+description present, description within
1024 chars, body within the 500-line L2 budget, no eval file leaking
into the shipped payload.
- Couples the mentioning skill's documented contract to the real
util.ParseMentions: its Incorrect examples must parse to nothing (a
name where a UUID belongs fails silently) and its Correct example must
fire. A drift in the mention regex now breaks CI instead of silently
turning the skill into a lie agents act on.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* feat(skills): add working-on-issues built-in skill
Co-authored-by: multica-agent <github@multica.ai>
* feat(skills): verify linked PRs in issue workflow skill
Co-authored-by: multica-agent <github@multica.ai>
* feat(skills): add skill import and discovery built-ins
Co-authored-by: multica-agent <github@multica.ai>
* feat(skills): add skill authoring built-in
Co-authored-by: multica-agent <github@multica.ai>
* docs(skills): align builtin skill workflows
Co-authored-by: multica-agent <github@multica.ai>
* docs(skills): use structured skill search
Co-authored-by: multica-agent <github@multica.ai>
* fix(skills): make built-in skill bundle launch-ready
Co-authored-by: multica-agent <github@multica.ai>
* fix(skills): align built-ins with additive skill binding
Co-authored-by: multica-agent <github@multica.ai>
* feat(skills): add creating agents built-in skill
Co-authored-by: multica-agent <github@multica.ai>
* Add built-in squads skill
Co-authored-by: multica-agent <github@multica.ai>
* refactor(skills): rewrite built-in skills as source-traced contracts
Rewrite the built-in agent skills to the inbuilt-skill-authoring standard:
state source-traced product facts with the source-code link logic as the
core, not prescriptive how-to coaching.
- creating-agents: drop the Decision-flow / Do-don't-consequences
methodology; replace with field/behavior contracts (validation, persisted
shape, daemon claim-time consumption, env gating, skill binding).
- skill-discovery: stop teaching repo/github_stars as selection signals —
searchClawHubSkills never populates them (always null); rank by
install_count + source/url + description. Add file:line citations.
- mentioning: drop the unbacked "member mention sends a notification" claim
(no such path in the comment handler); state that only agent/squad
mentions enqueue work. Tighten the parser-failure wording.
- working-on-issues: refresh citations drifted by the main merge; describe
the PR response `state` enum accurately; trim status coaching.
- skill-importing: correct response type to SkillWithFilesResponse; document
the reserved SKILL.md supporting-file rule; add line-accurate citations.
- squads: correct the "leader cannot be archived" overstatement (not
rejected at create/update; fails closed later at routing/dispatch);
refresh source-map attributions and test list.
Each skill now ships references/<skill>-source-map.md as its evidence layer
(line-accurate citations live there, not pinned in the test, so a future
main merge cannot rot them into stale lies).
builtin_skills_test.go: replace coaching/line-number pins with drift-resistant
contract anchors, forbid the coaching phrasing, and require every skill to
ship its source-map. The ParseMentions behavior coupling is preserved.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* docs(skills): close field-role and citation gaps found in review
Independent review of the rewritten built-in skills surfaced two real gaps and
some citation drift; this fixes them.
- creating-agents: add the three missing field rows (visibility,
max_concurrent_tasks, mcp_config) to the field-contract table — mcp_config is
runtime-consumed (TaskAgentData, daemon.go), visibility is access-control
(default private), max_concurrent_tasks is a scheduler cap (default 6).
Mark custom_args/runtime_config JSON validation as CLI-side (the server
marshals as-is). Correct the CLI body-builder note (description/instructions
use a non-empty check, the rest use Changed). Source-map: fix the env query
name (UpdateAgentCustomEnv), the conformance test name, and add the new field
defaults + the McpConfig runtime-payload line.
- mentioning: the @squad mention private gate is canAccessPrivateAgent, not
canEnqueueSquadLeader (that wrapper is the assignment/child-done path).
- working-on-issues: cite notifyParentOfChildDone at its func def (:51), not the
doc comment (:15).
- skill-importing: config.origin is set only when the source supplied an origin
— note it may be absent; cite createSkillWithFiles at its definition
(skill_create.go:72), not the call site.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* Add built-in skills for autopilots runtimes and resources
Co-authored-by: multica-agent <github@multica.ai>
* feat(runtime): list skill descriptions in the brief Skills index
The brief's `## Skills` section emitted bare skill names only, discarding
the one-line description that SkillContextForEnv already carries. For
Claude-family providers the frontmatter description is loaded natively;
for providers without native skill discovery (hermes/default) the brief's
list is the only signal they ever see, so a bare name gave them nothing
to decide when to load a skill. Emit `name — description` when a
description is present, falling back to the bare name when it is empty.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* refactor(skills): drop CLI-only rule from working-on-issues
The "Platform data goes through the CLI" section duplicated the runtime
brief's `## Important: Always Use the multica CLI` section verbatim (and
the attachment-via-CLI note duplicated the brief's `## Attachments`). The
CLI-only rule is universal and must be known before any skill loads, so
the brief is its single source of truth; the skill copy was pure
redundancy and a drift risk. Remove it and the matching intro clause.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* refactor(skills): remove discovery guidance from built-ins
* docs(skills): remove stale skill-necessity records
The per-skill necessity records had drifted to 3 of 8 shipped skills plus a
record for `multica-skill-authoring`, which is not a shipped built-in skill.
Per-skill "why it exists / when to use it" already lives co-located with each
skill (frontmatter `description` + `references/<skill>-source-map.md`) where it
cannot drift from the skill, and the doc's methodology duplicated the
workspace's inbuilt-skill-authoring protocol. Remove the file rather than keep a
parallel listing that every new skill has to remember to update.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* feat(runtime): add source-authority escape hatch to the brief
The brief already tells agents to run `--help` for command discovery, but
nothing stated the trust precedence when a skill, the brief, or a doc seems to
contradict actual behavior. Add one line to the Available Commands escape-hatch
note: trust the live CLI (`--help`/`--output json`) and the checked-out source
over source-traced prose that can lag the code, and verify on any conflict or
confusion. Kept in the always-on brief (universal, needed before any skill
loads) rather than duplicated into each skill; per-skill source-map pointers
remain the specific layer.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* fix(runtime): scope the source-authority escape hatch to the CLI
The previous version told agents the "checked-out source is the deeper
authority" for verifying behavior. That over-claims: the repos in a task's
brief come from GetWorkspaceRepos + project github_repo resources (per-workspace
config, see daemon.registerTaskRepos), not the Multica platform source. A
generic agent's checked-out source is its own app, not Multica's code, so it
cannot verify a Multica skill/brief claim against it. The only universally
available authority for Multica behavior is the live CLI (`--help` /
`--output json` / observed command behavior). Re-scope the line accordingly and
state plainly that the platform's source is not in the workdir.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* revert(runtime): drop the source-authority escape-hatch line
Reverts the brief addition from fdd5e82df and its follow-up cc67b2088. The
`--help` discovery fallback already in the Available Commands note is enough;
the extra trust-precedence sentence was unnecessary. runtime_config.go is now
identical to 6ca27ad74.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* docs(claude): remind to update built-in skills on CLI/field/behavior changes
Add a Coding Rule: when a change touches a CLI command/flag, API field, or
product behavior that a built-in skill documents, update that skill's SKILL.md
and source-map in the same PR. Lives in the repo dev-guide (read when working in
this repo), not the runtime brief — the runtime brief is injected into every
workspace, where most agents have no Multica skill to update. AGENTS.md is a
pointer to CLAUDE.md, so no mirror needed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* feat(metrics): scrape-time BusinessSamplerCollector for active users / queued / runtime gauges (MUL-2947)
Adds an opt-in prometheus.Collector that runs a fixed set of read-only
SQL queries on every /metrics scrape and exposes the results as gauges:
- multica_active_users{window=5m|1h|24h}
- multica_active_workspaces{window=...}
- multica_agent_task_queued{source}
- multica_agent_task_running{source,runtime_mode}
- multica_agent_task_stuck_total{source}
- multica_runtime_online{runtime_mode,provider}
- multica_runtime_heartbeat_age_seconds{runtime_mode} (histogram)
- multica_workspace_total
Plus a self-introspection histogram
multica_business_sampler_query_seconds{name=...} and a counter
multica_business_sampler_query_errors_total{name=...} so the sampler's
own behaviour is observable on /metrics.
Production-safety contract per the PR4 brief:
- every query runs in its own BEGIN READ ONLY tx with
SET LOCAL statement_timeout = '500ms' (configurable)
- the sampler takes a dedicated *pgxpool.Pool option so operators
can isolate it from business traffic
- successful results are cached for 5–10s (default 8s) to absorb
concurrent scrapes from multiple Prometheus replicas
- every SQL has a hard LIMIT 100 fallback
- all label values flow through the existing BusinessMetrics
NormalizeTaskSource / NormalizeRuntimeMode / NormalizeRuntimeProvider
whitelists, so a misbehaving runtime cannot inflate cardinality
- sampler is OPT-IN via RegistryOptions.BusinessSampler — existing
callers that only pass Pool keep their current behaviour and never
start hitting the DB on /metrics
Tests cover: emit shape, TTL cache (one DB call per N scrapes),
bounded cardinality under malicious labels, opt-out (no leakage), and
DB-hang isolation (unreachable host -> /metrics returns within 5s,
query_errors_total advances).
Refs MUL-2947 (depends on PR2 / MUL-2948, merged in #3695).
Co-authored-by: multica-agent <github@multica.ai>
* fix(metrics): address PR4 review — wire sampler in main.go, fix LIMIT bug, add live-DB statement_timeout test
Three fixes from 大彪's review on #3706:
1. main.go was building NewRegistry without the BusinessSampler option,
so the collector was effectively dead code in prod. Now constructs a
dedicated 2-conn pgxpool (newSamplerDBPool) from the same DATABASE_URL
when METRICS_ADDR is set, plumbs it into RegistryOptions.BusinessSampler,
and defers Close() at shutdown. A pool-build failure logs and disables
the sampler instead of taking down the server.
2. queryActiveUsers / queryActiveWorkspaces previously wrapped the
distinct-user/workspace subquery in a 'LIMIT 100', then COUNT(*)'d
the result — capping the active-user gauge at 100 regardless of
reality. Removed the inner LIMIT; the COUNT scalar is one row anyway,
and metric cardinality is bounded by the fixed samplerWindows
allow-list, not by the SQL shape.
3. The previous DB-hang test only exercised the acquire-fails path. Added
business_sampler_pgsleep_test.go which connects to a live Postgres
(skips cleanly when DATABASE_URL is not set), runs SELECT pg_sleep(2)
inside a sampler-style tx with SET LOCAL statement_timeout = '500ms',
and asserts:
- the call returns in well under 1.5 s (proving the server-side
cancellation, not just our caller-side context)
- query_errors_total{name=pg_sleep_canary} advances
- the duration histogram records the cancellation
Verified locally: 550 ms, SQLSTATE 57014 'canceling statement due to
statement timeout' — exactly the safety net the PR claims.
Refs MUL-2947 / PR #3706.
Co-authored-by: multica-agent <github@multica.ai>
* test(metrics): assert SQLSTATE 57014 on pg_sleep cancellation
The previous assertion only checked that the query was cut off in well
under the sleep duration, which a caller-side context cancellation
would also satisfy. Capturing the inner pgconn.PgError and asserting
Code == "57014" ("query_canceled") nails down that Postgres itself
cancelled the statement because of the SET LOCAL statement_timeout —
so a regression that drops the SET LOCAL line fails this test loudly
instead of silently passing on context cancellation.
Refs MUL-2947 / PR #3706 review nit.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
The self-hosting docs covered the WebSocket Upgrade-proxy failure mode
but not the backend's Origin allowlist, which rejects WS upgrades from
non-localhost origins with a 403 (websocket: request origin not allowed
by Upgrader.CheckOrigin) unless CORS_ALLOWED_ORIGINS / FRONTEND_ORIGIN is
set to the external origin. Self-hosters hit this as "chat / live updates
only appear after a manual page refresh" (#3677).
- Note CORS_ALLOWED_ORIGINS governs both HTTP CORS and the WS origin check
- Add the env-var requirement to the single-domain Caddy example
- Add an "allowlist the browser origin" callout to the WebSocket
troubleshooting section, including the exact backend error string
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* feat(server): funnel/community/commercial business metrics + PostHog pairing (MUL-2949)
PR3 of the Grafana board metrics split (parent MUL-2328).
Adds 23 new Prometheus counter/histogram families to the PR2 BusinessMetrics
collector covering the activation/community/commercial funnels, and binds
every PostHog event emission to a matching metric increment so the two sides
cannot drift.
Funnel: signup, workspace_created, team_invite_sent/accepted, onboarding_*,
cloud_waitlist_joined.
Content: issue_created, chat_message_sent, agent_created, squad_created,
autopilot_created, issue_executed.
Runtime: runtime_registered/ready/failed/offline + ready_seconds histogram,
daemon_ws_message_received_total.
Autopilot: autopilot_run_started/terminal/skipped.
Webhook/GitHub: webhook_delivery_total, github_event_received_total,
github_pr_review_total, github_pr_merge_seconds histogram.
CloudRuntime: cloudruntime_request_total + duration histogram, wired through
a small RequestRecorder interface so the cloudruntime package stays decoupled
from metrics.
Commercial: feedback_submitted, contact_sales_submitted.
The pairing helper metrics.RecordEvent(client, m, ev) emits the PostHog
event AND increments the matching counter via IncForEvent dispatch, reading
labels from the analytics event Properties. Every existing
h.Analytics.Capture(analytics.X(...)) call site has been migrated to the
helper across handler/, service/, and cmd/server/runtime_sweeper.go.
Lint enforcement (server/internal/metrics/business_pairing_test.go):
- TestEveryAnalyticsEventHasPrometheusCounter: every Event* constant in
analytics/events.go either dispatches via IncForEvent or is in the
taskMetricEvents allow-list (PR2 typed RecordTask* methods).
- TestNoNakedAnalyticsCaptureInHandlersOrServices: AST-walks handler/
service/cmd-server for direct Analytics.Capture(...) calls — only
service/task.go's captureTaskEvent helper is allow-listed.
- TestEveryAnalyticsRecordEventTakesAnalyticsHelper: validates the third
arg of every metrics.RecordEvent call is built from analytics.*.
Cardinality protection: all new label values pass through fixed allow-lists
in labels_pr3.go; unknown values collapse to 'other'/'unknown'/'error'.
Refs:
- Spec MUL-2328 / MUL-2949.
- Builds on PR2 (MUL-2948) — collectors registered through the same
BusinessMetrics struct, no separate Registry.
- Uses PR1's taskfailure.Reason (MUL-2946) for runtime_failed's failure_reason
label via NormalizeFailureReason.
Out of scope: Sampler-class metrics (PR4 / MUL-2947), pr_review_total
emission point (no review event handler exists yet — counter is defined,
TODO to wire up when /api/webhooks/github grows pull_request_review handling).
Co-authored-by: multica-agent <github@multica.ai>
* fix(server): tighten PR3 review items — signup_source bucket, fill platform/kind/form_source enums, onboarding_started server emission, lint scope (MUL-2949)
Addresses 张大彪's review on #3698:
1. signup_source: NormalizeSignupSource added to labels_pr3.go with a
fixed allow-list bucket (direct/google/twitter/linkedin/.../other).
Parses JSON cookie payload for utm_source/source/referrer fields,
strips URL schemes, maps well-known hostnames to channel buckets.
PostHog event still ships the raw cookie value for analytics; only
the Prometheus label is bucketed.
2. Filled the unknown/other label gaps:
- analytics.IssueCreated and analytics.ChatMessageSent now take a
platform parameter sourced from middleware.ClientMetadataFromContext
(X-Client-Platform header) at the handler. Autopilot-originated
issues stamp PlatformServer.
- analytics.FeedbackSubmitted now takes a kind parameter; CreateFeedback
reads req.Kind (default "general") so the picker selection lights up
the metric's kind label instead of long-term "other".
- analytics.ContactSalesSubmitted now takes a formSource (page /
onboarding / agents_page); CreateContactSales reads req.Source.
The metric reads ev.Properties["form_source"] so the analytics
CoreProperties.Source ("marketing_contact_sales") stays
backward-compat for PostHog dashboards.
3. analytics.OnboardingStarted helper added; server-side emission lives
in PatchOnboarding, fired exactly once per user on the first PATCH
that carries a non-empty questionnaire payload (firstTouch logic
compares prior bytes against {} / null). Frontend onboarding_started
keeps firing on page open; the server emission is what guarantees the
Prometheus counter exists so Grafana can be cross-checked against the
PostHog funnel without depending on the SDK roundtrip.
4. business_pairing_test.go tightened:
- TestNoNakedAnalyticsCaptureInHandlersOrServices now allow-lists at
function granularity (just captureTaskEvent in service/task.go), not
whole-file. Any future naked Capture in the same file fails CI.
- TestEveryAnalyticsRecordEventTakesAnalyticsHelper now does def-use
tracking inside the enclosing FuncDecl: when RecordEvent's third
arg is an *ast.Ident, the test walks the function body for the
assignment that defined it and confirms the RHS is an
analytics.<Helper>(...) call. Bare local idents that didn't
originate from analytics are now caught.
5. gofmt -w applied across the touched files; gofmt -l clean.
Tests: go test ./internal/metrics/... ./internal/analytics/... pass.
Pre-existing TestClaimTask_/TestWebhook_MergedPR/TestDeleteIssueByIdentifier
failures on origin/main are DB-environment-dependent and not regressions
from this change.
Co-authored-by: multica-agent <github@multica.ai>
* fix(server): normalise onboarding_started platform label + regression test (MUL-2949)
Addresses 张大彪's last review nit:
- IncForEvent's EventOnboardingStarted case now wraps the platform
property with NormalizePlatform, matching every other platform-bearing
metric. A misbehaving frontend can no longer leak a raw X-Client-Platform
header value into the multica_onboarding_started_total{platform=...}
series.
- New labels_pr3_test.go covers every PR3 normalizer with both a happy-path
value and an unknown value, asserting the unknown collapses to the
documented fallback bucket. Includes a focused regression for
onboarding_started: emits one event with an attacker-shaped platform
string and asserts the metric only exposes web + unknown label values
(no raw header bleed).
- testutil.go gains a small GatherForTest helper so the regression test
can pull the typed MetricFamily map without re-implementing the
registry-walk dance.
Co-authored-by: multica-agent <github@multica.ai>
* fix(server): NormalizeTaskSource on workspace_created + document lint limitations (MUL-2949)
Final review touch-ups before merge:
- IncForEvent's EventWorkspaceCreated case wraps source through
NormalizeTaskSource, matching the other source-bearing dispatches
(issue_created, agent_created, issue_executed). Closes the last raw
property leak in the dispatcher table.
- business_pairing_test.go inline docstrings now spell out the two
known limitations of the lint gate that 张大彪 / Eve flagged:
analyticsBackedIdents matches by ident NAME (not SSA def-use, so a
nested-scope shadow could pass) and isMetricsRecordEvent hard-codes
the import alias set. PR description carries a Follow-ups section
with the same two items so the work is visible after merge.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: 魏和尚 <agent+wei@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
cmd/migrate previously ran a check-then-apply loop on a *pgxpool.Pool
with no locking, so two backend pods starting at the same time (multi-
replica Deployment, scale-up, or a manual run overlapping with pod
startup) could both pass the EXISTS check on a pending migration and
race on the DDL or the schema_migrations INSERT, crashing the loser.
Take a single connection from the pool, hold a session-level
pg_advisory_lock for the entire migration loop, and release it on the
way out. We use the blocking variant so a late arriver queues behind
the current runner and then no-ops on the EXISTS checks instead of
crash-looping. The loop deliberately stays outside a transaction so
existing CREATE INDEX CONCURRENTLY migrations keep working.
Also refresh the values.yaml / backend.yaml comments next to
backend.replicas: the chart still ships replicas: 1 by default, but
that is now a recommendation (Recreate strategy, no leader split), not
a correctness requirement.
Refs https://github.com/multica-ai/multica/issues/3647
Co-authored-by: multica-agent <github@multica.ai>
* fix(issues): store start_date/due_date as DATE, not timestamp (MUL-2925)
These fields are calendar days (the pickers offer no time-of-day), but were
stored as TIMESTAMPTZ. A client serializing local midnight via toISOString()
folded its timezone into the instant, so the day shifted by the local offset
(GH #3618). Migrate the columns to DATE and parse/serialize date-only
"YYYY-MM-DD". ParseCalendarDate still accepts legacy RFC3339 (truncated to the
UTC day) so older clients keep working.
Co-authored-by: multica-agent <github@multica.ai>
* fix(issues): render start_date/due_date as timezone-stable calendar days (MUL-2925)
Pickers now emit date-only "YYYY-MM-DD" (local calendar day) instead of
toISOString(), and every read formats via the shared @multica/core/issues/date
helpers with timeZone:"UTC" so the day never shifts with the viewer's offset.
The Gantt's existing UTC bucketing is now correct. Covers web/desktop pickers,
quick-set menu, list/board/detail/activity, and the mobile due-date picker.
Co-authored-by: multica-agent <github@multica.ai>
* fix(issues): address date-only review — loud-fail ambiguous dates, finish display sweep (MUL-2925)
Review follow-ups on #3692:
- ParseCalendarDate no longer silently truncates a legacy non-midnight RFC3339
to the wrong UTC day; it accepts only YYYY-MM-DD or an exact UTC-midnight
instant and rejects ambiguous ones loudly. Adds util unit tests.
- migration 112 pins the TIMESTAMPTZ->DATE conversion to UTC explicitly via
AT TIME ZONE 'UTC' (was session-timezone dependent); down migration too.
- Convert remaining date-change display sites to formatDateOnly: inbox detail
label (web) and mobile activity + inbox labels (were new Date()+local format).
- CLI --start-date/--due-date help now says YYYY-MM-DD, not RFC3339.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
Lift MUL-1949's offline backfill failure_reason taxonomy into a shared
in-flight classifier so the agent_task_queue.failure_reason column is
written with refined values (provider_auth_or_access, context_overflow,
provider_capacity_or_rate_limit, …) at write time rather than waiting on
SQL backfill to re-classify after the fact. PR1 of the Grafana board
plan in MUL-2328 — the upcoming PR2 reuses pkg/taskfailure.AllReasons()
to pre-warm the Prometheus failure_reason label set.
* server/pkg/taskfailure: new package with the canonical 21 Reason
constants (7 platform-side + 14 agent_error.* sub-reasons),
AllReasons() returning a defensive copy, IsAgentError() prefix check,
and Classify(rawError) Reason mirroring the SQL CASE rules from
MUL-1949 (db-boy's analysis). 100% statement coverage.
* server/internal/daemon/daemon.go: route the 'agent_error' coarse
fallback paths (StartTask error, runTask early-return error, CompleteTask
permanent rejection, reportTaskResult default branch) and the
executeAndDrain default error case (chained after classifyPoisonedError)
through taskfailure.Classify so blocked / timeout / unknown-status
results all carry a refined reason on the wire.
* server/internal/service/task.go: FailTask classifies errMsg when the
daemon-supplied failureReason is empty, eliminating the legacy
COALESCE(.., 'agent_error') landing.
* server/internal/daemon/poisoned.go: alias FailureReasonIterationLimit
and FailureReasonAPIInvalidRequest to the canonical taskfailure
constants. agent_fallback_message and codex_semantic_inactivity are
pre-existing operational reasons not in the canonical 21 — kept as
literals for now and revisited in a follow-up PR.
Backfill SQL from MUL-1949 stays as the authoritative offline source of
truth; this PR keeps the in-flight classifier in lock-step with the SQL
CASE expression so historical and future rows share the same taxonomy.
No behavior change for the platform-side reasons (queued_expired,
runtime_offline, runtime_recovery, timeout, etc.) which already align
with the canonical set.
Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
* Optimize chat message loading
Co-authored-by: multica-agent <github@multica.ai>
* Fix chat history cursor pagination
Co-authored-by: multica-agent <github@multica.ai>
* Fix chat session list remount key
Co-authored-by: multica-agent <github@multica.ai>
* fix(chat): fall back to legacy /messages when paged endpoint 404s
Deployment-order compatibility: a backend deployed before the
/messages/page endpoint existed returns 404 for the unknown route.
The cursorless initial page now falls back to the legacy full-list
/messages endpoint and wraps it in a single has_more:false page, so
chat never white-screens regardless of which side deploys first. A 404
on a cursor request still propagates to avoid duplicating the full list.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Newer opencode (1.15+) syncs its hosted free-model catalog over the
network on `opencode models`, which can take ~6s. The previous 5s cap
killed the command, discoverOpenCodeModels returned an empty list, and
the daemon reported it as a successful empty result — so the runtime
showed online but the model picker was empty ("暂无可用模型").
Fixes#3627
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* feat(editor): support text highlight (==text==) in description & comments
Adds a single-color (yellow) text highlight mark to the shared rich-text
editor, round-tripped through stored Markdown as ==text==.
- HighlightExtension: @tiptap/extension-highlight + @tiptap/markdown hooks
(markdownTokenizer/parseMarkdown/renderMarkdown) so ==text== <-> <mark>
round-trips; inner inline formatting preserved via inlineTokens.
- Bubble menu: highlight toggle button (Mod-Shift-H), i18n in 4 locales.
- Read-only renderer: highlightToHtml lowers ==text== -> <mark> (skips code
and math); rehype-sanitize schema whitelists <mark>. Nested Markdown inside
a highlight still parses via the existing rehype-raw step.
- prose.css: single yellow <mark> style, legible in light/dark.
Pinned @tiptap/extension-highlight to exact 3.22.1 to match @tiptap/core
(>=3.23 expects a getStyleProperty export core 3.22.1 doesn't have).
Web/desktop only. Mobile (native md4c, no == syntax, no custom renderers)
is tracked as a follow-up. MUL-2934.
Tests: editor round-trip (cross-process serialization protocol), readonly
<mark> rendering + sanitize, and the ==->mark transform incl. code-skip.
Co-authored-by: multica-agent <github@multica.ai>
* fix(editor): align highlight boundary rules across editor & readonly
Addresses two boundary bugs from review (PR #3661):
1. A == inside inline code/math could close a highlight when the opening
== was outside the literal span (e.g. ==a `b==c` d== wrongly became
<mark>a `b</mark>c` d==). Both the editor tokenizer's lazy regex and the
readonly transform only guarded the opening fence, not the closing one.
2. The readonly transform matched across blank lines (==a\n\nb==) while the
editor lexes those as two literal paragraphs — a storage↔editor↔readonly
mismatch.
Fix: extract one shared matcher (utils/highlight-match.ts) used by BOTH the
editor tokenizer and the readonly lowering, so the rules can't drift. It skips
fences that fall inside code/math literal ranges (open or close) and caps the
inner span at the first blank line.
Tests: shared-matcher unit tests + both repros covered on the editor
(round-trip/HTML) and readonly (transform + rendered DOM) sides.
Co-authored-by: multica-agent <github@multica.ai>
* fix(editor): handle CRLF in highlight blank-line boundary
BLANK_LINE_RE only matched LF, so a CRLF blank line (==a\r\n\r\nb==) was not
recognized as a block boundary and got highlighted. Widen to \r?\n[ \t]*\r?\n.
Tests: CRLF blank-line (no highlight) + CRLF soft-break (still highlights) on
the matcher, readonly transform, and editor sides.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
A skill_file row whose path is the skill's own SKILL.md (persisted by
older builds or direct create/update API calls) collides with the
primary content the daemon writes itself, failing task prep with
errPathPreExists on every non-codex local runtime (#3489).
#3526 guarded this with strings.EqualFold(path, "SKILL.md") at the
daemon write site and the three API ingress points, but the stored path
is not canonicalized: "./SKILL.md" or "sub/../SKILL.md" slip past the
exact-match guard while filepath.Join still resolves them onto the same
SKILL.md, so prep can still break.
Extract one canonical helper, skill.IsReservedContentPath, that cleans
the path before the case-insensitive compare, and use it at all four
sites (execenv writeSkillFiles, skill create, update, single-file
upsert). Add a daemon-side regression test for writeSkillFiles ignoring
a bundled SKILL.md (exact + "./" spellings) — the load-bearing fix
previously had only API-layer coverage — plus a unit test for the helper.
Existing poisoned rows are intentionally left in place (skipped at prep)
per the decision on MUL-2928.
MUL-2928
Follow-up to #3526; supersedes #3560.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* fix(daemon): make comment-posting guardrail provider-agnostic (MUL-2904)
Agents inlining a backtick-wrapped token into `multica issue comment add
--content "..."` had the shell run it as a command substitution, silently
deleting the token; the stored comment never matched the model's intent, so
it retried forever — spamming OKK-497 with duplicate comments.
The corruption is shell-driven, not provider-driven, so extend the
"never inline --content; use --content-file / quoted-HEREDOC --content-stdin"
rule from Codex-only to ALL providers:
- BuildCommentReplyInstructions: collapse the Linux/macOS non-Codex inline
branch into the unified quoted-HEREDOC stdin template.
- buildMetaSkillContent: rename "Codex-Specific Comment Formatting" ->
"Comment Formatting" and emit it for every provider; strengthen the
Available Commands entry and the assignment step-6 examples to steer away
from inline --content.
- Windows behavior unchanged (file-only; avoids PowerShell ASCII drop).
Tests: flip the non-Codex Linux reply test into a MUL-2904 regression,
broaden the stdin-emphasis test across providers, and pin the
provider-agnostic guardrail.
Co-authored-by: multica-agent <github@multica.ai>
* fix(daemon): keep Windows assignment brief file-only (address review)
Review catch on #3654: the previous commit added platform-agnostic prose
recommending "--content-file or --content-stdin" in the Available Commands
entry and the assignment-triggered step-6 example. The assignment path has
no BuildCommentReplyInstructions OS override, so on Windows an agent following
step 6 literally would pipe its final comment through PowerShell and drop
non-ASCII bytes (#2198 / #2236 / #2376) — contradicting this PR's own
Windows file-only rule in the ## Comment Formatting section.
Make the platform-agnostic surfaces defer to the OS-aware ## Comment
Formatting section (the single source of truth) instead of naming stdin.
The flag synopsis still lists all three modes.
Add TestInjectRuntimeConfigWindowsAssignmentBriefStaysFileOnly: a Windows
assignment-triggered brief must not contain any prescriptive "... or
--content-stdin" recommendation.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* fix: autopilot page and modal mobile responsive
* fix(autopilots): label icon-only action buttons and keep desktop padding
- Add aria-label to Edit/Run now buttons so they have an accessible
name on mobile where the text label is hidden via 'hidden sm:inline'.
- Change button padding 'px-2 sm:px-3' -> 'px-2 sm:px-2.5' so the
size="sm" default (px-2.5) is preserved on desktop (no visual diff).
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J (Multica Agent) <agent-j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
Adds a value (default true for backward compatibility) that gates the
uploads PersistentVolumeClaim, the backend container's volumeMount, and
the pod-spec volume. Operators who serve uploads from S3 (S3_BUCKET set)
can now set backend.uploads.persistence.enabled=false to drop the PVC
entirely, removing the ReadWriteOnce Multi-Attach barrier on the storage
side for replicas > 1.
Also makes the PVC accessModes configurable (default [ReadWriteOnce]) so
operators with a ReadWriteMany-capable StorageClass can share the
uploads volume across replicas without object storage.
Documentation: values.yaml comments and the SELF_HOSTING.md resource
list are updated to describe the new toggle.
Refs: https://github.com/multica-ai/multica/issues/3646
Co-authored-by: multica-agent <github@multica.ai>
* fix(issues): validate and clamp limit/offset in ListIssues (MUL-2847)
ListIssues parsed the limit and offset query params but never validated
them, so:
- GET /api/issues?limit=-1 -> HTTP 500 (Postgres rejects negative
LIMIT with SQLSTATE 2201W)
- GET /api/issues?limit=100000000 -> unbounded read in a single
response
- GET /api/issues?offset=-1 -> same 500
SearchIssues and ListGroupedIssues already apply v > 0 + an upper clamp
on limit and v >= 0 on offset. This brings ListIssues to the same
pattern: ignore non-positive limit (keep default 100), clamp to 100,
ignore negative offset (keep default 0). default == clamp == 100 keeps
existing callers' behavior identical and matches the upstream issue
suggestion.
TestListIssues_LimitValidation seeds 3 issues in a dedicated project
and pins the nine boundary cases (negative/zero/huge/non-numeric
limit, negative/non-numeric offset, the clamp boundary, and explicit
small/positive-offset sanity) plus two sanity checks that an explicit
small limit and a positive offset are honored.
Fixes MUL-2847 / upstream multica-ai/multica#3563.
Co-authored-by: multica-agent <github@multica.ai>
* test(issues): strengthen LimitClamp test and fix comments (MUL-2847)
Address review feedback from @Lambda and @Emacs on PR #3585:
1. The 3-row set in TestListIssues_LimitValidation can't distinguish
'clamp fired' from 'clamp missing': with only 3 rows, limit=100000000
returns 3 rows whether or not the clamp exists. Split the clamp
behavior into a new TestListIssues_LimitClamp that seeds 101 issues
and asserts len(issues) == 100 for limit=100/101/200/100000000, plus
limit=50 honored below the clamp. Without the clamp line, the
huge/above-clamp subtests would fail with len == 101.
2. Fix the misleading comment that claimed 'limit=0 -> same 500'.
Postgres LIMIT 0 is valid SQL and returns zero rows. The guard
exists for sibling-consistency (SearchIssues / ListGroupedIssues
already treat v <= 0 as 'use default'), not to avoid a 500. Move
the limit=0 case out of TestListIssues_LimitValidation since it's
not 500-related; TestListIssues_LimitClamp's 'no limit returns
default page of 100' subtest pins the default behavior anyway.
3. Add a subtest that pins the offset+clamp composition
(limit=200&offset=50 against 101 rows = 51 rows), proving the
clamp caps the page size while offset still indexes the full
result set.
4. Fix gofmt: the original file's leading-bullet comment indentation
was off by two spaces; gofmt -l now reports clean.
All 14 subtests across both functions pass; full ./internal/handler/
suite still passes (3.2s).
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* feat(onboarding): backfill prompt for users missing source attribution
Adds a one-shot popup shown after login to already-onboarded users
whose `onboarding_questionnaire.source` was never recorded — either
they completed onboarding before the source step shipped, or they
clicked Skip on it. Reuses the existing 12-option StepSource UI and
the existing `PATCH /api/me/onboarding` endpoint, so no schema or
backend changes.
Web renders it as a route at /onboarding/source (sibling of the
reserved /onboarding); desktop dispatches it as a WindowOverlay per
the Route categories rule. Submit and explicit Skip are terminal;
the close X bumps a per-user localStorage counter and stops appearing
after 3 dismissals.
Emits source_backfill_shown / submitted / skipped / dismissed PostHog
events so the funnel can be tracked separately from first-time
onboarding.
For MUL-2796.
Co-authored-by: multica-agent <github@multica.ai>
* fix(onboarding): preserve role/use_case and respect dismiss cap in source backfill
Round-2 fixes from Emacs's review of #3550:
1. PATCH wipe: `PATCH /api/me/onboarding` replaces the JSONB column
wholesale (server/internal/handler/onboarding.go), so sending only
the source slots was wiping role/use_case/version for exactly the
historical users this targets. Read user.onboarding_questionnaire,
overlay the source fields client-side via mergedQuestionnairePatch,
and send the full shape. 7 unit cases cover the merge semantics.
2. Legacy single-string source: pre-multi-select rows wrote
`source: "search"` as a bare string. needsSourceBackfill now treats
that as already answered, matching mergeQuestionnaire (views) and
stringOrSlice.UnmarshalJSON (server). Flipped the existing test and
added empty-string + null coverage.
3. Dismiss cap honored in callback: the web auth callback was passing
dismissCount=0, which would force-route capped users through
/onboarding/source on every login (the route page would bounce them
onward, but only after a blank detour and a re-fired
`source_backfill_shown` event). Added readSourceBackfillDismissCount
so the callback reads the same per-user localStorage bucket the
prompt writes to. Test asserts a count of 3 bypasses the detour.
Co-authored-by: multica-agent <github@multica.ai>
* test(onboarding): clear source-backfill dismiss counter in callback test beforeEach
Co-authored-by: multica-agent <github@multica.ai>
* fix(onboarding): footer hint text matches the Submit button on the backfill prompt
The Source step's hint reads "Hit Continue when you're ready" because
its commit button is "Continue". The backfill view ships a "Submit"
button instead, so the inherited hint was misleading. Add a dedicated
`source_backfill.hint_ready` key across en / zh / ko and use it here.
Caught during browser E2E in the round-2 verification stack.
Co-authored-by: multica-agent <github@multica.ai>
* fix(onboarding): magic-code login also detours through source backfill
The round-2 fix in PR #3550 only wired the source-backfill detour
into the OAuth `/auth/callback` post-success path. Magic-code login
goes through `/login` → `handleSuccess()` which calls
`resolveLoggedInDestination()` and pushes directly to the workspace,
so those users never reach `/onboarding/source`. Caught during the
local-env demo for Jiayuan.
Add `maybeSourceBackfillDetour` to the login page and apply it in
both the already-authenticated useEffect and the post-verify-code
handler. Predicate consults the same per-user localStorage bucket
the prompt writes to, so a user who hit the close-X cap on this
browser flows straight through.
Co-authored-by: multica-agent <github@multica.ai>
* refactor(onboarding): source backfill is a workspace-mounted modal, not a route detour
Per UAT, the prompt should overlay the workspace as a Dialog with the
workspace visible behind a dimmed backdrop — the original brief and
reference screenshot both showed a modal. PR #3550 shipped a full-window
takeover (web /onboarding/source + desktop WindowOverlay) which Jiayuan
rejected.
This commit replaces the full-window view with a Dialog-based
`<SourceBackfillModal />` mounted once inside the shared `DashboardLayout`
(packages/views/layout). The modal self-mounts: it reads
`needsSourceBackfill(user, dismissCount)` and opens itself when the
predicate flips to true; X / ESC / outside-click all bump the per-user
localStorage cap and close.
Removed:
- apps/web/app/(auth)/onboarding/source/page.tsx (route)
- paths.sourceBackfill (no longer needed)
- callback page detour
- login page maybeSourceBackfillDetour
- desktop WindowOverlay type "source-backfill"
- desktop navigation interception of /onboarding/source
- desktop App.tsx dispatch effect
- pageview-tracker case
- views/onboarding `SourceBackfillView` + `readSourceBackfillDismissCount` exports
Preserved (semantics unchanged):
- `needsSourceBackfill` predicate (incl. legacy single-string source coercion)
- `mergedQuestionnairePatch` so role / use_case survive Submit / Skip
- PostHog events: source_backfill_shown / submitted / skipped / dismissed
- Per-user dismiss-count cap (3) in localStorage
- en / zh / ko i18n strings
Tests:
- 7 new tests for the modal in packages/views/onboarding/source-backfill-modal.test.tsx
- Adjusted apps/web/app/auth/callback/page.test.tsx: detour tests dropped,
one assertion remains that onboarded users with missing source land in
the workspace (the modal handles the rest)
- Full suite: 965 tests pass, typecheck + lint clean
Co-authored-by: multica-agent <github@multica.ai>
* fix(onboarding): mount source-backfill modal on the desktop workspace too
Desktop's WorkspaceRouteLayout never wraps DashboardLayout, so the
previous commit's modal mount only fired for web. Regression: desktop
users were not seeing the prompt at all.
Wire the same `<SourceBackfillModal />` next to `<WelcomeAfterOnboarding />`
inside `workspace-route-layout.tsx`, with the matching
`!overlayActive` suppression so the Dialog doesn't portal-jump above
an active pre-workspace WindowOverlay (onboarding / accept-invite /
new-workspace). Same component on both platforms — single source of
truth lives in packages/views/onboarding/source-backfill-modal.tsx.
Also drop the now-stale `source-backfill detour` comment in the web
callback test fixture (Emacs nit, non-blocking).
Co-authored-by: multica-agent <github@multica.ai>
* test(desktop): assert workspace-route-layout mounts source-backfill modal
Two structural tests pinning the round-4 fix:
- `mounts SourceBackfillModal when no WindowOverlay is active` —
guards against the regression Emacs caught (modal silently absent
on desktop because the previous round only wired DashboardLayout).
- `suppresses SourceBackfillModal while a WindowOverlay is active` —
mirrors the existing `!overlayActive` rule that WelcomeAfterOnboarding
already relies on so a portal-rendered Dialog can't visually outrank
an active pre-workspace overlay.
Mocks the SourceBackfillModal with a marker component so the test
asserts mount/unmount without depending on the modal's own predicate
gate.
Co-authored-by: multica-agent <github@multica.ai>
* fix(onboarding): backfill modal Other toggles off; entrance settles after 700ms
UAT round-3 follow-ups from Jiayuan:
1. **Other can't be deselected**: the modal kept a parallel
`pendingOther` flag set to true on every Other click, and
`IconOtherOptionCard`'s row click was guarded with
`if (!selected) onSelect()` — so a second click neither flipped
pendingOther nor reached the parent toggle. Drop `pendingOther`
(the `source.includes("other")` derivation is already authoritative)
AND add an opt-in `allowToggleOff` prop to `IconOtherOptionCard`
that lets the row toggle when already selected. The text input
stops click propagation so typing never deselects.
2. **Rebase + absorb GitHub channel**: rebased onto origin/main which
added `social_github` (PR #3612). Modal's option list now mirrors
StepSource — GitHub slotted between YouTube and Other social,
reusing the existing `GitHubIcon`.
3. **Soft entrance**: defer the dialog open by 700ms after the user
lands on a workspace so the underlying view paints first and the
modal feels like an inviting prompt rather than a hard block.
Honour `prefers-reduced-motion: reduce` (open immediately for
users who have opted out of incidental motion).
Tests:
- New `Other toggles off on the second click instead of getting stuck`
- New `renders the GitHub channel rebased from origin/main`
- New `defers the entrance by ~700ms when the user has not opted into
reduced motion`
- Existing tests stamp `prefers-reduced-motion: reduce` in beforeEach
so the dialog opens synchronously and they don't need to drive
fake timers.
Full suite passes (969 tests).
Co-authored-by: multica-agent <github@multica.ai>
* fix(onboarding): backfill modal opens reliably + Other deselects via icon area
Three follow-up fixes after live UAT:
1. Strict-mode regression on entrance delay: the gate ref was being
stamped when the effect *scheduled* the timer, so React Strict
Mode's double-invoke cleared the first timer and then bailed on
the second pass because the ref was already set, leaving the
dialog forever closed. Stamp the ref only inside the timer
callback (or synchronously when reduced-motion is on) so the
second strict pass starts a fresh timer.
2. Other deselect: dropping `pendingOther` wasn't enough — the input
that replaces the label when Other is selected was previously
stopping click propagation, so a re-click on the row never
reached the toggle. Remove `e.stopPropagation()` and instead let
the row's onClick ignore clicks whose target IS the input
(typing / focusing the input still doesn't deselect; clicks on
the icon, padding, or border do).
3. Tests: drive the Other re-click via Playwright `click({position:
{x:24,y:24}})` so the click lands on the icon area instead of the
center of the input, matching real-user behaviour.
Co-authored-by: multica-agent <github@multica.ai>
* refactor(onboarding): source picker is single-select primary source
Per Jiayuan's call after the survey of HDYHAU UX in PLG SaaS (Linear /
Vercel / Loom / Notion / Webflow / Stripe / Figma / Cursor / PostHog
mostly skip the question entirely; where it's asked the documented
default — Fairing / Recast / HockeyStack / Ruler Analytics — is to
capture the primary source so channel weights sum to 100% and ROI
math is defensible).
Modal + StepSource both pivot from multi-select to single-select
radio. Server schema is intentionally untouched: `source` stays
`string[]` for back-compat with v2 multi-select rows; the client
always sends a one-element array. Zero migration, zero data loss.
Frontend:
- `source-backfill-modal.tsx`: state pivots from a multi-element
`source: Source[]` to a single `pickedSlug` derived from
`source[0]`; click handler replaces the array instead of toggling.
Cards switch to `mode="radio"`, the fieldset gets `role="radiogroup"`,
the now-redundant `pendingOther` and `allowToggleOff` opt-in go
away — radio mode means no toggle-off, so the original UAT bug
("Other can't be deselected") is structurally impossible.
- `step-source.tsx`: drop the `multiSelect` prop so it routes
through `step-question.tsx`'s existing radio path (same one
StepRole already uses). Picking a second option replaces the
first; switching away from Other clears `source_other` so a stale
value can't leak.
- `icon-option-card.tsx`: revert the `allowToggleOff` plumbing.
Tests:
- `source-backfill-modal.test.tsx`: drop the multi-select toggle-off
assertion; add "picking a second option replaces the first" with
explicit radio-role queries.
- `step-source.test.tsx`: rewrite multi-select tests as single-select
(no more "stacks several picks" / "toggle off" cases); add
"switching away from Other clears source_other".
Full suite (970 tests) green, typecheck + lint clean.
Co-authored-by: multica-agent <github@multica.ai>
* docs(onboarding): refresh stale multi-select comments around source
Comment-only follow-up to the single-select refactor in d14f9d09f.
Five docblocks still described `source` as multi-select; they now
correctly say single-select and explain the array shape is kept
purely for v2 back-compat with the JSONB column.
- packages/core/onboarding/types.ts — QuestionnaireAnswers docblock
- packages/core/onboarding/store.ts — PostHog mirror comment
- packages/views/onboarding/steps/step-question.tsx — header docblock,
canContinue branch, and footer-hint comment (Source moves from the
multi-select side to the single-select side; Use case stays as the
remaining multi-select consumer)
- server/internal/handler/onboarding.go — questionnaireAnswers docblock
and the stringOrSlice fall-back comment (the column "going multi-
select" is no longer the current state; rename to "pre-array shape")
- server/internal/analytics/events.go — OnboardingQuestionnaireSubmitted
docblock
No behaviour changes. Tests + Go build still green.
Co-authored-by: multica-agent <github@multica.ai>
* i18n(onboarding): add ja translations for source-backfill keys
The Japanese locale landed on main (PR #3538) after this branch
started, so my source-backfill round-2 keys (`common.close`,
`source_backfill.eyebrow / lede / submit / hint_ready`) never made
it into ja and the parity test fails in CI. Add them now with
translations that match the en/zh-Hans/ko wording and tone.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* feat(editor): add / slash-command palette for invoking agent skills
Adds a `/` trigger in the chat box that opens a popover listing the active
agent's skills. Selecting an item inserts a `[/label](slash://skill/<id>)`
token; the daemon extracts those IDs in `buildChatPrompt` and emits an
"Explicitly selected skills:" block using the canonical names from the
agent's skill registry — labels are display-only and never trusted.
Built on Tiptap's `Mention` extension so the suggestion lifecycle,
keyboard routing, and IME handling mirror the existing `@` mention UX.
Item list is sourced from the React Query workspace cache (no per-keystroke
fetch). Gated behind a new `enableSlashCommands` prop so only `chat-input`
opts in; other `ContentEditor` consumers (issue editor, comments) are
unaffected. Read-only markdown surfaces render the token as a `.slash-command`
pill via a custom link renderer + sanitize-schema/url-transform allowlists.
Closes#3108
* fix(i18n): add slash_command editor copy for ko/ja
The PR added slash_command popover empty-state keys to en + zh-Hans only;
locales/parity.test.ts requires every locale to cover every EN key, so ko
and ja failed CI. Add the two keys (no_skills_configured, no_results)
matching existing skill terminology (스킬 / スキル).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Naiyuan Qing <145280634+NevilleQingNY@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: gate private squad leader from being triggered by unauthorized members
Add canEnqueueSquadLeader helper that checks canAccessPrivateAgent before
allowing a squad leader to be enqueued. Gate all EnqueueTaskForSquadLeader
call sites:
1. enqueueSquadLeaderTask (comment trigger, assign trigger, backlog→todo)
2. triggerChildDoneSquad (child-done → parent squad leader)
3. autopilot.go (defensive comment; actor is always agent → always passes)
Also fix validateAssigneePair's squad branch to run canAccessPrivateAgent
on the squad leader, returning 403 'cannot assign to squad with private
leader' when the actor lacks access.
Thread actorType/actorID through notifyParentOfChildDone →
dispatchParentAssigneeTrigger → triggerChildDoneSquad so the child-done
path can enforce the private-leader gate.
Regression tests:
- Plain member blocked from create-issue to private-leader squad (403)
- Plain member blocked from update-issue to private-leader squad (403)
- Owner allowed to assign private-leader squad
- Plain member comment on squad-assigned issue doesn't trigger private leader
- Child-done by plain member doesn't trigger parent's private leader
- Agent actor can still trigger private leader via comment
Closes MUL-2860
Co-authored-by: multica-agent <github@multica.ai>
* fix: add private-leader gate to autopilot save + dispatch paths
- validateAutopilotAssignee squad branch: call canAccessPrivateAgent on
the leader, returning 403 for unauthorized members at save time.
- service/autopilot.go: add canCreatorAccessPrivateLeader helper that
mirrors the handler-level canAccessPrivateAgent logic (agent creators
pass; member creators must be owner/admin or agent owner).
- Gate both dispatch paths (dispatchCreateIssue and dispatchRunOnly)
with fail-closed check: if leader is private and creator lacks access,
the run is skipped instead of triggering the private leader.
Regression tests:
- Plain member create autopilot to private-leader squad → 403
- Plain member update autopilot to private-leader squad → 403
- Owner create autopilot to private-leader squad → 201
- Owner-created autopilot dispatch → issue_created (positive)
- Legacy plain-member-created autopilot dispatch → skipped (fail-closed)
Co-authored-by: multica-agent <github@multica.ai>
* test: add run_only legacy private-leader squad dispatch regression test
Covers the dispatchRunOnly path explicitly, complementing the existing
create_issue dispatch test. Both dispatch branches now have direct test
coverage for the private-leader fail-closed gate.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
* fix(desktop): contain renderer crashes
Co-authored-by: multica-agent <github@multica.ai>
* fix(desktop): filter renderer exit prompts
Co-authored-by: multica-agent <github@multica.ai>
* refactor(desktop): drop redundant page-level ErrorBoundary on issue detail
The whole-page <ErrorBoundary> wrapper duplicated the new route-level
errorElement (DesktopRouteErrorPage). Let render errors bubble to the
root route boundary so all detail routes are contained the same way.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(desktop): add Close tab escape to route error page
Reload tab recreates the same crashing path and Go to issues is a dead
end when the issues route itself crashed. Add a Close tab action that
destroys the crashing router entirely and falls back to a sibling tab
(or a reseeded default), the only always-safe escape regardless of
which route crashed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* i18n: add japanese locale
* fix: spacing issues
* refactor
* fix(desktop): set <html lang> before paint to avoid JA Kanji font flash
Switch the documentElement.lang sync from useEffect to useLayoutEffect so
lang is committed before the first paint. Otherwise Japanese desktop users
saw one frame of Kanji rendered with the Chinese-first fallback stack before
the html[lang|="ja"] CJK override applied. Also fix the stale selector in the
HTML_LANG comment (html[lang^="ja"] -> html[lang|="ja"]).
Addresses review nits on MUL-2893.
Co-authored-by: multica-agent <github@multica.ai>
* fix(docs): tokenize the ideographic iteration mark in JA search
Add U+3005 (々) to the Japanese search tokenizer character class. It sits just
below the kana blocks, so words like 様々 / 日々 / 個々 previously dropped the
mark and split awkwardly, hurting recall.
Addresses a review nit on MUL-2893.
Co-authored-by: multica-agent <github@multica.ai>
* fix(i18n): restore ja locale parity after merging main
Merging main brought new EN strings into agents/chat/onboarding/settings/
squads that the ja bundle (authored against an older snapshot) lacked, breaking
the locales parity test. Add the Japanese translations for the new keys
(workspace logo upload, agents runtime filter, chat session-history stop
dialog, onboarding social_github, squad archived status) and drop the two
renamed chat window keys (active_group / archived_group) that EN removed in
favour of history_group.
Fixes the failing @multica/views parity.test.ts on the FE CI for MUL-2893.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* fix(workspace): recover from stale workspace state
* fix(workspace): apply review nits for recovery flow
- no-access-page: navigate via nav.replace so a browser Back doesn't
land the user back on NoAccessPage with the dead slug
- no-access-page: refresh the stale cookie-clear comment — the recovery
button no longer routes through `/`; the clear now guards other `/`
entry points (manual nav, Back into `/`, fresh page load)
- tab-store: drop the redundant `as string | undefined` cast (the Set
value is already string | undefined under TS 5.9)
- tab-store.test: cover the route-layout heal path (all stale groups
dropped, then seed a fresh tab for a valid slug) and assert the
dropped group's router is disposed
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
Adds OpenCode model variant discovery for thinking controls, passes saved thinking_level through opencode run --variant, and hardens verbose model parsing with fallback coverage.
Retired agents (agent.archived_at set) previously read as offline across
the agent dot, hover card, detail badge, and squad member list — a
leftover online runtime row could even make them look reachable. Add a
dedicated archived presence/status that wins over every runtime/task
signal so a retired agent never reads as live or merely offline.
- Add archived to AgentAvailability and SquadMemberStatusValue unions
- Short-circuit deriveAgentPresenceDetail before runtime/task scan
- Backend deriveSquadMemberStatus returns archived instead of offline
- Render gray Archive dot/label; skip workload + reassign affordances
- en/ko/zh-Hans locale strings
Backfill the missing query invalidations (chat / labels / invitations) in invalidateWorkspaceScopedQueries, so those lists refresh on WS reconnect and workspace switch instead of showing stale data until a manual refresh.
Adds tests covering invalidation on ws-instance change and actor_type passing to event handlers.
MUL-2882
* refactor(chat): rework chat history list
- Drop legacy archived sessions from the history dropdown. The
soft-archive feature was removed, so status='archived' rows are dead
data; exclude them instead of showing a collapsed "archived" group.
Rename the section heading "Active" -> "Chat history".
- Swap hover row actions into the status column's slot instead of an
absolute overlay: status is hidden on hover and actions take its
place inline, while the title keeps flex-1. No mid-row gap, no
overlap, no text bleed-through.
- Remove orphaned i18n keys (active_group, archived_group,
archived_label) across en/zh-Hans/ko.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(issues): align execution log rows with the chat hover-swap pattern
- Drop the fixed w-20 status column that forced premature truncation of
the trigger text and left a mid-row gap; status now sizes to content.
- Running tasks render only the spinner (sr-only label retained for a11y
and tooltip); the redundant "Working" text is removed.
- Hover swaps status for actions in place (RowStatus hidden, RowActions
inline) instead of an absolute gradient overlay. Applies to both
active and past ("show past runs") rows via the shared RowShell /
RowStatus / RowActions.
Known tradeoff: dropping the absolute+opacity slot also drops the
group-focus-within keyboard reveal, so cancel/retry are no longer
Tab-reachable. Matches the chat pattern; revisit if keyboard access for
row actions becomes a requirement.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bug 1: detect copilot.cmd/.bat on Windows and invoke the sibling .ps1 directly via powershell -File, bypassing cmd.exe %* re-tokenisation that mangled the multi-line -p prompt. Shared rewriteCmdToPS1() now serves cursor, pi, and copilot.
Bug 2: filterCustomArgs (shared by all agent backends) strips one outer layer of shell quotes via unshellQuoteArg() before processing, so shell-style custom args like --deny-tool='write' no longer reach the CLI with literal quotes.
Adds avatar_url column to workspace, threads it through the API +
WorkspaceAvatar component, and adds a click-to-upload editor in the
workspace settings tab. Mirrors the squad avatar pattern (migration 086);
UI strings use "logo" while the schema/code uses avatar_url for codebase
consistency with user.avatar_url and squad.avatar_url.
- migration 093: ALTER TABLE workspace ADD COLUMN avatar_url TEXT
- UpdateWorkspace SQL + handler accept avatar_url (auth gated to
owner/admin at the router via RequireWorkspaceRoleFromURL)
- WorkspaceAvatar renders <img> when avatar_url is set, falls back to
the initial-letter span otherwise
- workspace-tab.tsx adds a 16x16 click-to-upload logo editor at the
top of the general settings card, using useFileUpload + accept=
image/png,image/jpeg,image/webp (server stores under workspaces/{id}/)
- en + zh-Hans settings i18n strings added
Co-authored-by: Matt Voska <voska@users.noreply.github.com>
CancelTaskByUser (POST /api/tasks/{taskId}/cancel) keyed cancellation off
issue_id / chat_session_id alone, so any task whose only source link was
autopilot_run_id (run_only autopilots) or quick_create context fell into the
dead else branch and 404'd with "task not found" — even though the task was
visible (and showed a cancel X) on the agent Activity tab.
Enforce tenancy uniformly through the task's owning agent instead: agent_id is
NOT NULL on every task row (ON DELETE CASCADE), and agents are workspace-scoped,
so GetAgentTaskInWorkspace (task JOIN agent ON workspace) is a single tenant
guard that works regardless of which optional source FK is set — including
orphan tasks whose autopilot_run_id was SET NULL after the autopilot was
deleted. Privacy layers on top: chat tasks stay creator-only, and every other
task mirrors the agent Activity / snapshot private-agent visibility gate via
canAccessPrivateAgent so the id-only endpoint is never more permissive than the
surface that exposes the task.
Tests cover run_only (same-ws success, cross-ws 404 no-mutation), quick_create,
retry clones, issue-task regression, chat non-creator 403, and private-agent
plain-member 403.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
The Go SKILL.md frontmatter parser unmarshalled into a {Name,Description}
string struct, so a non-scalar value (a list/map written where a scalar
belongs) made the whole decode fail and dropped even a valid sibling
`name`. The TS parser instead kept the name and JSON-encoded the value,
so the file-viewer (TS) and the import path (Go) could disagree about
the same SKILL.md.
Decode into a generic map and coerce per key on the Go side, mirroring
the TS coercion (scalars -> literal form, sequences/mappings -> JSON), so
both sides produce identical results and a structured value never
discards a sibling key. Rename ParseFrontmatter -> ParseSkillFrontmatter
to remove the cross-language name clash with the TS parseFrontmatter
(which returns {frontmatter, body}), and drop the unused TS
parseSkillFrontmatter export.
Add parity tests for sequence/mapping values plus name-only,
description-only, leading-blank-line and triple-dash-in-body edge cases
on both sides.
Follow-up to #3543 / MUL-2842.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
Three independent line-based frontmatter parsers only handled
single-line `description: value`, so a YAML block scalar
(`description: |`) collapsed to the literal "|" and the rest of the
description was dropped before it ever reached the database.
Replace all three with real YAML decoders that understand block
scalars, folded scalars and quoted values:
- server/internal/skill: shared ParseFrontmatter via gopkg.in/yaml.v3,
used by both the handler import path and daemon local-skill discovery
- packages/core/skills: shared parseFrontmatter via the yaml package
- file-viewer renders multi-line frontmatter values (whitespace-pre-wrap)
Both parsers fall back to empty values on malformed YAML, preserving the
previous non-fatal behaviour.
Add 'social_github' as a new attribution source option in the
onboarding 'How did you hear about Multica?' multi-select picker,
alongside the existing X / LinkedIn / YouTube options.
Includes:
- New 'social_github' value in the Source type union
- New GitHubIcon in the brand-icons component
- New option in step-source.tsx (placed next to other social picks)
- en/zh-Hans/ko i18n labels
Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
The claude backend wrote the full prompt to the child's stdin and closed
it before starting the stdout reader goroutine. With
--verbose --output-format stream-json the CLI emits a startup banner
before reading its first stdin frame; with no reader draining stdout, the
child blocks on its stdout write, never reads stdin, and our stdin Write
blocks until the per-task context fires. The field symptom is tasks
failing exactly at the 2 h per-task timeout with
"write |1: The pipe has been ended."
Move writeClaudeInput into its own goroutine so the prompt write and the
stdout drain proceed concurrently. Guard stdin close with sync.Once (it
can now be called from both the writer goroutine and, previously, the
result handler). Join the write result at cmd.Wait() and surface a write
failure as a "failed" status only when no result event arrived and no
session was established, so a genuine startup death still reports the
stderr tail.
Add a regression test that re-execs the test binary as a fake claude
which bursts 256 KiB to stdout before reading stdin, with a 128 KiB
prompt pushed at stdin — both past any plausible OS pipe buffer — so a
regression hangs until the test deadline instead of passing.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
* feat(agents): add runtime machine filter to Agents tab (MUL-2846)
Add a dropdown filter to the Agents tab toolbar that lets the user
narrow the list to agents bound to a specific runtime machine. The
filter reuses `buildRuntimeMachines` from the runtimes package so the
machine grouping (Local / Remote / Cloud) matches the Runtimes page
sidebar, and the per-machine agent counts respect the current scope
(Mine/All) so the numbers reflect what the user would see if they
clicked the row.
Only rendered in the Active view; the Archived view's toolbar is
unchanged. If the selected machine is GC'd while the user is on the
page (daemon stopped, runtime deleted), the filter auto-resets to
'All runtimes' instead of leaving the list empty. The no-matches state
now surfaces 'No agents on <machine>' when the machine filter is the
reason for zero results.
Adds new `runtime_filter` and `no_matches.runtime_filtered` /
`no_matches.search_runtime_filtered` i18n keys in en, zh-Hans, and
ko. 7 new unit tests in
`runtime-machine-filter-dropdown.test.tsx`.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agents): address code review on runtime machine filter
- Plumb localDaemonId / localMachineName / hasLocalMachine / currentUserId
through AgentsPage → buildRuntimeMachines so the Local section and
device-name consolidation match the Runtimes page on both web and
Desktop. Adds a DesktopAgentsPage wrapper that bridges daemonAPI the
same way DesktopRuntimesPage does.
- Make the 'All runtimes' badge use the in-scope total instead of
summing per-machine counts, so an agent bound to a GC'd runtime
doesn't silently vanish from the count.
- Move Date.now() out of the machines useMemo into a useState lazy
init so the snapshot stays stable per mount.
- Drop unused i18n keys (all_description / this_machine / reset) from
runtime_filter in en / zh-Hans / ko.
- Add a regression test for the All-runtimes badge divergence.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agents): machine-scoped availability counts + Base UI menu items
Follow-up to the previous code-review round (Emacs review at 1144b6023).
#1 (medium) — Availability counts now respect the selected machine.
Introduce an inScopeOnMachine memo (inScope narrowed by the selected
runtime machine, but NOT by availability chip or search) and use it as
the base for both availabilityCounts and the AvailabilityFilterRow's
totalCount, so the chips reflect 'agents on this machine' once a
machine is selected. filteredAgents is now derived from inScopeOnMachine
so the availability chip and search further refine within the machine
scope. The dropdown's 'All runtimes' badge still uses inScope.length —
it's the count the user would see if they cleared the filter, so it
should stay unfiltered.
#2 (low) — Dropdown rows now use DropdownMenuItem instead of raw <button>.
Replaces the bare <button> in RuntimeMachineFilterItem with the
shared DropdownMenuItem wrapper (Base UI Menu.Item). The rows are now
registered as proper menu items: keyboard navigation (arrow keys, Enter,
Space), typeahead, ARIA role='menuitem' semantics, and auto-close on
selection (closeOnClick: true) all work. Active styling is preserved
via data-active, and a data-highlighted variant on the inactive style
matches Base UI's keyboard-focus appearance.
Tests updated to use role-based queries (getByRole('menuitem')) and
add a regression that verifies the menu is properly registered with
Base UI.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: MiniMax M3 <M3@multica.local>
* feat(email): support implicit TLS (SMTPS/465) for SMTP relay
The SMTP relay previously only did opportunistic STARTTLS: it dialed
plaintext and upgraded if the server advertised STARTTLS. Providers that
only offer implicit TLS on port 465 and do not advertise STARTTLS (e.g.
Aliyun enterprise mail) could not be used as a relay at all.
Add an SMTP_TLS env var:
- unset / starttls (default): unchanged STARTTLS-upgrade behavior.
- implicit / smtps / ssl: dial with tls.DialWithDialer (SMTPS).
Implicit TLS is auto-enabled when SMTP_PORT=465 and SMTP_TLS is unset, so
the common case works with no extra config. The startup log line now
reports the negotiated mode (starttls / implicit-tls).
Co-authored-by: multica-agent <github@multica.ai>
* feat(email): plumb SMTP_TLS through selfhost compose, warn on unknown values
The backend reads SMTP_TLS but docker-compose.selfhost.yml never forwarded
it, so SMTP_TLS=implicit on a non-standard port (or an explicit starttls
override on 465) silently did nothing inside the container. Add it to the
backend.environment block.
Also log a one-line warning when SMTP_TLS is set to an unrecognized value
(e.g. "tls"/"true"/"on"), which would otherwise fall through to STARTTLS
and fail to dial a 465 SMTPS port with no startup hint.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* test(email): cover SMTP_TLS precedence and alias resolution
Table-driven test over NewEmailService asserting the implicit-TLS decision:
465 auto-enables implicit; explicit starttls on 465 overrides auto-detect;
implicit/smtps/ssl aliases (case-insensitive, whitespace-trimmed) force SMTPS
on any port; unknown values fall back to starttls.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* docs: document SMTPS / SMTP_TLS support, drop "465 unsupported"
Port 465 implicit TLS is now supported, so the five places that said it was
unsupported are wrong. Replace those sentences, add an SMTP_TLS row to the
environment-variables tables (EN + ZH), and add a copy-pasteable SMTPS env
block to the auth-setup pages.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: guofengchang <guofengchang@cumulon.com>
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Expose self-host daemon setup URLs from /api/config at runtime so the Add computer dialog renders the operator's own server/app domains, while Multica Cloud defaults stay unchanged.
Fixes#3013.
Closes the runtime-side gap of #2106: previously `agent.mcp_config` was
honored only by Claude Code (via `--mcp-config <file>`); for OpenCode the
field was accepted by the API but silently ignored at execution time.
## Approach
OpenCode has no `--mcp-config` flag. Project the agent's `mcp_config`
into OpenCode via OPENCODE_CONFIG_CONTENT — OpenCode's general
inline-config injection environment variable, which accepts any subset
of OpenCode's config schema (model / agent / mode / plugin / mcp / …)
and merges at "local" scope after the project-config loop. MCP is the
only field this PR projects through that channel; if a future Multica
field needs the same channel it would assemble a combined config slice
before the env append.
The env-var route was deliberate. An earlier draft of this PR wrote
the translated MCP servers into <workdir>/opencode.json and removed
the file on cleanup; review (#3098) flagged that the task workdir is
reused across turns for the same (agent, issue), and any agent- or
user-written model / tools / permission settings in opencode.json
must survive across runs. OPENCODE_CONFIG_CONTENT avoids the workdir
entirely — nothing is written to disk, no cleanup is needed, and the
env entry dies with the spawned process.
OPENCODE_CONFIG_CONTENT was added to OpenCode in v1.4.10 (2025-09); the
official @opencode-ai/sdk uses the same env var to inject runtime
config, so the surface is stable. Verified empirically against
OpenCode 1.15.6 in our K8s runtime: `opencode debug config` returns
the injected mcp slice deep-merged with the user's global config,
and <workdir>/opencode.json is observably untouched.
## Translation surface
`agent.mcp_config` accepts two shapes for portability:
- Claude-style `{"mcpServers": {name: {url|command, ...}}}` is
translated into OpenCode's native form: `type: "local"|"remote"`,
`command` coerced to a string array, `env` renamed to `environment`.
- Native OpenCode `{"mcp": {name: ...}}` accepts the three shapes
OpenCode's schema permits and is strict-decoded against each:
- McpLocalConfig: `{type:"local", command:[…], environment?, enabled?, timeout?}`
- McpRemoteConfig: `{type:"remote", url:"…", headers?, oauth?, enabled?, timeout?}`
- bare override: `{enabled: bool}` (toggle a server inherited
from global / project config without redefining it)
Decoding uses `json.DisallowUnknownFields` so any field outside the
matching schema is rejected — matching OpenCode's
`additionalProperties: false`. Without this, a malformed payload
(e.g. `command: "node"` instead of `command: ["node"]`) would reach
OpenCode verbatim and either silently disable the server or crash
the CLI at startup.
Field-level checks the strict decoder doesn't catch:
- `timeout` must be a positive integer (rejects 0, negative, fractional)
- `oauth` must be either an object (validated against McpOAuthConfig)
or the literal `false`; primitives and `true` are rejected as ambiguous
- `oauth.callbackPort` must be in 1..65535 when set
## Precedence
Go's os/exec dedups `cmd.Env` by key keeping the LAST occurrence
(Go 1.9+). Appending OPENCODE_CONFIG_CONTENT after `buildEnv(b.cfg.Env)`
guarantees the daemon's value wins over any value the user happened
to put in `agent.custom_env` — which matches the intended semantics
(`mcp_config` is the authoritative daemon-managed field; `custom_env`
is the escape hatch). When that override happens we surface a warning
log so accidental clobbers are debuggable.
## Limitation (out of scope, accepted in review)
OpenCode also deep-merges its **global** config
(`~/.config/opencode/opencode.json`) into every session and exposes no
flag to disable that. Operators who want strict per-agent isolation
from the global layer can set:
```jsonc
// agent.custom_env on the platform
{ "XDG_CONFIG_HOME": "/tmp/opencode-isolated" }
```
…pointing at any directory without an `opencode/` subdir. OpenCode then
reads no global config and only honors what the daemon injects via
OPENCODE_CONFIG_CONTENT. Verified with `opencode debug config`.
## Changes
server/pkg/agent/opencode_mcp.go (new):
- buildOpenCodeMCPConfigContent — translates raw mcp_config into the
JSON string OpenCode accepts via OPENCODE_CONFIG_CONTENT, returns
"" when there's nothing to inject so the caller can skip the env
entry (avoids clobbering anything the user put in
agent.custom_env.OPENCODE_CONFIG_CONTENT)
- translateMCPConfigForOpenCode + helpers — Claude-style → OpenCode
native shape
- validateOpenCodeNativeMCPEntry + opencodeMCPLocal /
opencodeMCPRemote / opencodeMCPEnabledOnly / opencodeMCPOAuth
typed structs — strict-decode native-shape entries against the
schema (DisallowUnknownFields), plus targeted post-decode
assertions for timeout / oauth / callbackPort
server/pkg/agent/opencode.go:
- 12 lines of env injection in Execute(), placed AFTER buildEnv so
the daemon's value wins via os/exec dedup
- warning log when agent.custom_env duplicates the same key
- no on-disk state, no rollback closure, no post-run cleanup —
OPENCODE_CONFIG_CONTENT lives only in the spawned process env
server/pkg/agent/opencode_mcp_test.go (new):
- TestBuildOpenCodeMCPConfigContent_{Empty,Remote,Local,Native}
- TestBuildOpenCodeMCPConfigContent_NativeAcceptsAllSchemaFields —
covers each native variant round-tripping every optional field
(local with env+timeout+enabled; remote with headers+oauth-object+
timeout+enabled; remote with oauth: false; bare {enabled} override)
- TestBuildOpenCodeMCPConfigContent_RejectsMalformedNative — 31-case
table covering every constraint on Bohan-J's review: command must
be a string array, environment / headers values must be strings,
oauth must be an object or false, timeout must be a positive
integer, additionalProperties: false (per-shape allow-list checked
via DisallowUnknownFields)
- TestOpencodeBackendInjectsMCPConfigViaEnv — E2E happy path; fake
opencode binary captures $OPENCODE_CONFIG_CONTENT, asserts the
translated mcp slice is present AND <workdir>/opencode.json was
NOT written
- TestOpencodeBackendOmitsMCPEnvWhenEmpty — empty mcp_config does
NOT inject the env, preserving any value the user set in
agent.custom_env
- TestOpencodeBackendOverridesUserOpenCodeConfigContent — daemon
value wins via os/exec dedup keep-last
apps/docs/content/docs/providers.{en,zh}.mdx:
- flip OpenCode's MCP cell from ❌ to ✅
- reword the "MCP configuration: only Claude Code actually reads it"
section so OpenCode is included; describe each tool's mechanism
(Claude → `--mcp-config`, OpenCode → OPENCODE_CONFIG_CONTENT)
apps/docs/content/docs/install-agent-runtime.{en,zh}.mdx:
- update the Claude Code blurb (no longer "the only one")
- expand the OpenCode blurb to mention mcp_config support
- fix the now-broken /providers anchor
Refs #2106 (TS types and per-agent UI for mcp_config are separate
follow-ups, not in this PR).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#3509/#3523 scoped the comment-trigger since-delta count to the triggering
thread, so an agent resuming a busy issue only saw "+N in this thread" and
lost visibility of new comments in other threads. Revert the count to
issue-wide (every thread), keeping the trigger-comment + agent-own
exclusions, and reshape the warm-path hint to:
- report the issue-wide new-comment volume,
- steer the agent to read the triggering (parent) thread FIRST
(`--thread <trigger> --since`, or `--tail 30` for full context),
- demote the issue-wide `--since` catch-up to an only-if-needed fallback
("don't read them all blindly").
Also fixes the now-stale "scoped to the triggering thread" wording in the
resumed-session no-delta hint (it's issue-wide zero now).
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
PR #3517 moved the AgentLiveCard out of the activity section to a
sticky bar above the editable title. This restores its original
placement (after LocalDirectoryHint, above the timeline) and reverts
the container margin from mb-4 back to mt-4 that suited the top slot.
Everything else from #3517 is kept: the single-container multi-agent
accordion, the lifted cancel-confirmation dialog, and the dropped
parked/running background variant.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PR #2337 (Multica for iOS) accidentally added expo / react /
react-native to the monorepo ROOT package.json dependencies — almost
certainly a `pnpm add` run from the repo root instead of
`--filter @multica/mobile`. The root isn't an app and imports none of
them; apps/mobile already declares all three itself (react pinned to
19.2.0 for Expo SDK 55, per apps/mobile/CLAUDE.md).
The stray root react@19.2.0 hoisted to top-level node_modules/react,
producing a React version skew against the catalog's 19.2.3 that the
rest of the tree (web / desktop / ui / views) uses. After removal the
hoisted top-level react resolves to 19.2.3 and mobile keeps its own
19.2.0 untouched. Lockfile shrinks ~411 lines as the root importer no
longer pulls the Expo/RN transitive tree.
Pure dependency hygiene — no source changes, no behavior change for any
app. Mobile build unaffected (verified its package.json still declares
expo/react/react-native/react-dom).
* docs(i18n): translate documentation corpus to Korean
Add Korean (.ko.mdx) translations for all 32 navigable docs pages plus
meta.ko.json navigation, mirroring the English source. Product terms
(Issue→이슈, Agent→에이전트, Squad→스쿼드, Runtime→런타임, Skill→스킬,
Workspace→워크스페이스, etc.) follow the in-app Korean locale at
packages/views/locales/ko/. Roles (owner/admin/member) and issue status
enums stay lowercase English per the conventions glossary.
MUL-2817
Co-authored-by: multica-agent <github@multica.ai>
* feat(docs): serve Korean docs content, remove English-fallback stopgap
Now that the *.ko.mdx corpus exists, drop the temporary docsContentLang
ko→en shim and the static-params fallback-synthesis loop so /docs/ko/*
renders real Korean content. Korean is now a first-class locale whose
params come straight from source.generateParams(). Also align the docs
home hero copy (agent→에이전트) with the app and the translated body.
MUL-2817
Co-authored-by: multica-agent <github@multica.ai>
* docs(i18n): align residual Korean UI/product terms with the app
Address review: sweep the .ko.mdx corpus for product/UI terms left in
English and match the in-app Korean locale.
- skills page title Skills → 스킬
- UI nav paths localized: Settings → 설정, Runtimes → 런타임, Agents →
에이전트, Projects → 프로젝트, Squads/New squad → 스쿼드/새 스쿼드,
Usage → 사용량, Personal Access Tokens → API 토큰, Provider → 제공자,
and the agent-create form labels (Name/Provider/Model/Instructions →
이름/제공자/모델/지침)
- see-also links Issues/Workspaces/Environment variables and
'Providers Matrix' → Korean
- kept as literals (verified): code blocks, the conventions i18n glossary
data, 'Anthropic Agent Skills' (standard name), the Squad Operating
Protocol/Roster/Instructions prompt-block names, the literal 'Project
Context' prompt section, and Xcode's Settings path
- add a docsAlternates test asserting ko hreflang is emitted when a real
*.ko.mdx page exists
MUL-2817
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* fix(cli/login): accept mcn_ Cloud Node PATs alongside mul_ (MUL-2815)
multica login --token rejected anything not starting with mul_, so
users with a Multica Cloud Node PAT (mcn_ prefix) hit
"invalid token format: must start with mul_" even though the server
middleware verifies both kinds.
Replace the inline literal check with validateLoginTokenPrefix(), backed
by a small loginTokenPrefixes list ({mul_, auth.CloudPATPrefix}) so the
accepted set has one source of truth. Add unit-test coverage so adding
a new prefix in future is an obvious one-line edit.
Co-authored-by: multica-agent <github@multica.ai>
* fix(cli/login): mention mcn_ Cloud Node PATs in --token help and comments
Follow-up to 47e423c4: the login command now accepts mcn_ tokens but the
help string and surrounding comments still only documented mul_, so a user
running 'multica login --help' couldn't tell that mcn_ was supported.
Update the --token help string and the cobra Args / NoOptDefVal comments
to list both mul_... and mcn_... prefixes.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
Move the per-issue "agent is working" bar out of the activity section to a
sticky bar at the top of the main content, above the editable title, and fix
the multi-agent layout.
- Single active task fills one bordered container as a directly-actionable
row (Logs + Stop inline).
- Multiple active tasks collapse into the same container: a summary header
(avatar stack + count) that expands the rows inline as a divided list,
instead of stacking N detached banners.
- Lift cancel confirmation to AgentLiveCard so one dialog serves both the lone
row and the expanded list (avoids a confirm dialog being torn down by an
enclosing popup).
- Drop the redundant parked/running background variant; running vs queued is
signalled by icon + label only.
- Reuse the existing agent_activity.hover_header plural for the summary; no new
i18n keys.
- Update tests for single-inline vs multi-accordion behavior.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(views): unify detail/list headers into shared BreadcrumbHeader
Replace four hand-rolled, divergent header styles (workspace-name root,
"/" separator, back-arrow, raw div) with one shared BreadcrumbHeader
component. The mental model is now identical everywhere: leading crumbs
are the thing's real containers and clicking one navigates up.
- New packages/views/layout/breadcrumb-header.tsx (segments/leaf/actions)
- Detail pages (issue, project, runtime, skill, autopilot, agent, squad)
now render `{Section} › name`; org name removed as a breadcrumb root
- Issue breadcrumb shows the single most-direct container only (parent
wins over project; they are orthogonal columns), never a fabricated
chain; bare issue shows just its title
- Issue leaf (identifier + title) is now a clickable link to the issue
detail page with a subtle hover:opacity-80
- Issues / My Issues list headers drop the workspace prefix, matching the
icon + title style of the other list pages
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(views): update breadcrumb tests for unified header behavior
The header unification changed three observable behaviors the tests
asserted against:
- issue detail no longer renders the workspace name as a breadcrumb root
- bare issue shows only its (now clickable) title leaf, no ancestor crumbs
- the project "Unknown project" error placeholder was removed
Rewrite the two affected issue-detail tests to assert the new leaf-link
and no-project-crumb behavior, drop the obsolete Unknown-project test, and
update the issues-page header test to assert the workspace prefix is gone.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(comments): roots-only thread stats + summary projection for comment list
Enrich the roots_only read so each root carries reply_count (recursive
descendant count) and last_activity_at (MAX created_at over the subtree),
letting an agent triage which thread to open without fetching any replies.
Add an orthogonal summary=true projection (--summary) that clips each
returned comment's content to a fixed budget and sets content_truncated,
so an agent can scan a list cheaply before pulling a full body. It composes
with every read mode (default, since, thread, recent, roots_only).
New response fields are optional (omitempty) and only populated for the
agent-facing query params, so the default response shape is unchanged for
the desktop/web and existing CLI callers.
Co-authored-by: multica-agent <github@multica.ai>
* test(comments): cover roots_only + summary composition end-to-end
The summary projection composing with roots_only is the spec's headline
"table of contents" read, but it was only exercised at the CLI param-
forwarding level — no handler test asserted that a roots_only response
both clips content AND keeps reply_count / last_activity_at. A refactor
moving the clip into a per-mode branch would silently break that
composition with no failing test.
Add TestListComments_RootsOnlySummaryComposes: a long root + a reply,
read via roots_only=true&summary=true, asserting the root is clipped
(content_truncated=true) while its subtree stats still surface.
Co-authored-by: multica-agent <github@multica.ai>
* refactor(comments): address review nits on roots stats + summary
- ListRootComments[Since]ForIssue: scope the recursive membership walk to a
selected_roots CTE (the @row_limit page, with the @since cut applied up front)
so stats are only computed over the subtrees of the roots actually returned,
instead of every thread in the issue.
- summarizeContent: scan by rune and stop at the budget+1th rune instead of
allocating a full []rune for the whole body, so a pathologically long comment
costs only the budget under summary mode. Add a multi-byte (CJK) test to lock
rune-boundary clipping.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
Remove the agent-path self-trigger guard in triggerChildDoneAgent so a child going done wakes its parent agent even when the same agent owns both — a serial sub-task handoff across two different issues, not a loop. Runaway re-triggering stays bounded by HasPendingTaskForIssueAndAgent. Squad path unchanged. Closes#3374.
* fix(runtimes): consolidate out-of-band local daemon by host name
The desktop derived `localMachineName`/`localDaemonId` only from the daemon
it manages itself. When the real daemon runs out-of-band (e.g. in WSL2 on a
Windows host), the app never gets a device name, so the #3336 device-name
consolidation short-circuits on `!!localMachineName`: the local-mode runtime
falls into REMOTE and an empty "This machine" placeholder is synthesized.
Fix in two parts:
- Desktop exposes the host OS hostname (`os.hostname()`) via a new
`daemon:get-host-name` IPC and uses it as the final fallback for
`localMachineName`, independent of daemon state.
- Scope device-name consolidation to the current user's own local runtimes
(`owner_id === currentUserId`). The runtime list is workspace-wide, so a
host-name match alone could otherwise claim another member's identically
named machine as "this machine".
Once the real runtime classifies as current it moves to LOCAL and the empty
placeholder is no longer synthesized.
MUL-2799
Co-authored-by: multica-agent <github@multica.ai>
* fix(runtimes): use OS-neutral "This device" label for the current machine
The current-machine badge was hard-coded to "This Mac" (en), so it rendered
incorrectly on Windows/Linux hosts. zh-Hans already used the neutral "本机".
MUL-2799
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* feat(comments): since-delta new-comment hint + default-on comment session resume (#3432)
* feat(db): add unresolved comment count + list filter queries
Add CountUnresolvedComments (excludes the agent's own comments) and
ListUnresolvedCommentsForIssue. Both are additive — existing callers stay
on the unfiltered queries — so old clients are unaffected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(handler): support unresolved-only comment listing
Wire an additive `unresolved` query param into ListComments. Defaults off
so an old CLI that never sends it gets unchanged behavior; only true/1
enable it. Rejects combining unresolved with thread/recent (whole-issue
filter vs navigation models). Includes filter + count query tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(handler): plumb unresolved count + thread root into claim, gate comment resume
Populate trigger_parent_id (thread root of the trigger comment) and
unresolved_count (excludes the agent's own comments) on comment-triggered
claim responses. Both fields are omitempty so old daemons ignore them.
Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION
(default off): resumed comment turns can inherit the prior turn's "Done."
final message, so this stays an explicit rollout switch. The runtime-match
and poisoned-session guards still apply regardless of the flag.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(daemon): inject unresolved-comments hint + resolve step into agent brief
Add a shared BuildUnresolvedCommentsHint helper rendered on both the
per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It
ships only the count and the relevant CLI call — never comment bodies — so
the server stays cheap. Thread case points at --thread <root>; issue case
points at --unresolved. Suppressed when the count is 0.
Also add a workflow step telling the agent to `multica comment resolve
<thread-root>` once a thread is fully handled, so the unresolved set
converges.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(cli): add comment list --unresolved and comment resolve command
Add an --unresolved filter to `issue comment list` (wired to the server's
unresolved param, rejected when combined with --thread/--recent) and a
top-level `comment resolve <id>` command that POSTs to the existing
/api/comments/{id}/resolve endpoint, letting an agent close threads it has
fully handled.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(comments): since-delta new-comment hint + default-on comment resume
Simplifies the comment-triggered agent flow down to what's actually needed:
- New-comment awareness is now a pure time delta: the claim response carries
new_comment_count + new_comments_since (anchored on the prior run's
started_at, never completed_at so a long run can't miss comments). The
per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s)
since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the
two surfaces can't drift. Cold start (no prior run) falls back to a plain read.
- Comment-triggered tasks resume the prior session by default (same runtime),
dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS
comment" prompt guard defends against inheriting the prior turn's "Done."
marker; GetLastTaskSession still excludes poisoned sessions.
- Drops the resolved-based machinery from the first draft: CountUnresolvedComments
/ ListUnresolvedCommentsForIssue queries, the `comment list --unresolved`
flag, the `multica comment resolve` command, and the resolve workflow step.
- Removes the verbose cursor-pagination paragraph from the comment prompt; the
--thread/--recent/--since flags stay in the CLI/API, just no longer explained
inline every turn.
Compatibility: new claim fields are omitempty (old daemons ignore them).
Comment resume is default-on and affects even old daemons, which already
consume prior_session_id.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(comments): collapse reply parent_id to thread root on write
Comment threads are a 2-level model (root + flat replies, like Linear/Slack),
enforced today only by the UI and the agent path — the CreateComment handler
stored whatever parent_id it was handed, and the agent-side flatten walked just
one level, so a reply-to-a-reply could land at depth 3+. Add GetThreadRoot (a
recursive walk to the parent_id=NULL root) and run both write paths
(handler.CreateComment, service.createAgentComment) through it, so every stored
reply's parent_id IS its thread root. Readers can now treat parent_id as the
thread root without re-walking. The agent-drift guard still compares the raw
parent_id to the trigger comment before normalization.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(comments): cold-start reads triggering thread, warm keeps --thread pointer
The since-delta rework dropped the thread-first read on the COLD path: a
first-time agent fell back to the flat `comment list` dump (oldest-first, cap
2000), burying the trigger's context in ancient chatter. Point cold start at the
triggering conversation instead via a shared BuildColdCommentsHint
(`--thread <trigger> --tail 30` + a --recent pointer for cross-thread
background). On the WARM path, --since is a pure time delta and can miss the
triggering thread's pre-anchor history, so BuildNewCommentsHint now also emits a
--thread pointer. Both surfaces (per-turn prompt + CLAUDE.md workflow) render via
the shared helpers so they cannot drift (PR #2816 rule).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Claude Code now ships Opus 4.8 (claude-opus-4-8). Add it to the three
places that enumerate Claude models so the picker, thinking-level
catalog, and usage cost estimates all recognize it:
- claudeStaticModels(): list Claude Opus 4.8 (Sonnet 4.6 stays default)
- claudeModelEffortAllow: Opus supports the full low..max set incl. xhigh
- MODEL_PRICING: $5/$25 in, $0.50 cache read, $6.25 5m cache write —
same current-gen Opus tier as 4.5/4.6/4.7, confirmed against
platform.claude.com/docs/en/about-claude/pricing
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* MUL-2792 fix(agent): preserve skills in update/archive/restore response (#3459)
agentToResponse always initialises Skills as []; the mutation handlers
relied on the caller to refresh it, but only GetAgent and ListAgents
actually did. UpdateAgent / ArchiveAgent / RestoreAgent therefore
returned "skills": [] regardless of what the agent_skill junction table
contained.
The DB write path was never wrong — skills weren't actually deleted —
but the misleading response (and its matching agent:status / archived /
restored WS broadcast) scared users into manually re-running
`agent skills set` and risked scripted clients writing the empty set
back as truth.
Extract the existing GetAgent skill-reload block into attachAgentSkills
and call it from the three buggy handlers. Add regression tests that
attach skills, hit each mutation endpoint, and assert both the response
and the junction table.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agent): attach skills before env/template broadcasts (#3459)
Two follow-up sites flagged in PR #3464 review that shared the same
"agentToResponse zeroes Skills, callers forget to reload" pattern as the
mutation handlers:
- agent_env.go: the agent:status broadcast after UpdateAgentEnv used a
bare agentToResponse, so subscribers saw skills wiped on every env
rotation. HTTP body is AgentEnvResponse so the response itself is
unaffected, but the WS event still misleads any cache that ingests it.
- agent_template.go: CreateAgentFromTemplate attaches imported and extra
skills inside the tx, then builds the response/agent:created broadcast
without reloading them — so callers (and any client tracking the
create event) see the freshly created agent as skill-less despite the
template having just imported them.
Both call sites now reuse attachAgentSkills introduced for UpdateAgent.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
Extracts CODE_LIGATURE_CLASS and CODE_LIGATURE_DESCENDANT_CLASS into
packages/ui/lib/code-style.ts. Non-markdown CLI command surfaces
(onboarding/cli-install-instructions, runtimes/connect-remote-dialog)
can now import the class strings without pulling in the shiki +
react-markdown + katex dependency graph via the markdown barrel.
CodeBlock and Markdown continue to consume the constants from the
new module; the markdown barrel no longer re-exports CODE_LIGATURE_CLASS.
MUL-2793
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* MUL-2778 feat(agent): wire mcp_config through OpenClaw runtime
The MCP config tab (#3419) lets admins save mcp_config on an agent, and
recent work (#3439) plumbed it through the three ACP runtimes. OpenClaw
still ignored the field, leaving the Tab silently inert for any
OpenClaw-backed agent.
Translate the agent's Claude-style `{"mcpServers": {...}}` into the
per-task OpenClaw wrapper's `mcp.servers` block — OpenClaw resolves MCP
via its own config schema rather than ExecOptions, so the existing
OPENCLAW_CONFIG_PATH preparer is the right seam. Fail closed on
malformed JSON / entries missing `command` or `url`, matching the
fail-closed posture the preparer already uses for the agents.list step.
Null / absent mcp_config leaves the wrapper free of an `mcp` key so the
user's global mcp.servers flows through untouched; an explicit empty
managed set (`{}` / `{"mcpServers":{}}`) is honoured as "admin saved no
servers" mirroring `hasManagedCodexMcpConfig`.
Strict-mode replacement (drop user-only servers entirely) would require
OpenClaw to do a per-key replace rather than a deep merge at
`mcp.servers`; the comment documents that caveat rather than relying on
undocumented behaviour.
Also adds `openclaw` to `MCP_SUPPORTED_PROVIDERS` so the MCP Tab
actually surfaces in the agent overview pane, and pins the new
visibility case with a renderPane test.
Co-authored-by: multica-agent <github@multica.ai>
* MUL-2778 fix(agent): make openclaw mcp_config strict-replace via sanitized snapshot
Elon flagged on #3450 that the previous wiring let user-only mcp.servers
leak through the wrapper's `$include` of the live user config: deep-merge
at `mcp.servers` keeps user-only names, and the strict-empty case
(`{ "mcpServers": {} }`) silently inherited user globals.
Switch the strict-replace path to write a sanitized snapshot of the
user's fully resolved config (via `openclaw config get --json`) with the
`mcp` block stripped, then have the wrapper `$include` the snapshot
instead of the live user file. With the user's `mcp` gone from the
$include resolution, the wrapper's `mcp.servers` is the only definition
the embedded OpenClaw sees — managed only, including the explicit empty
set.
The snapshot lives in envRoot at 0o600 alongside the wrapper so the GC
reaper sweeps it with the rest of the task scratch, and no extra
OPENCLAW_INCLUDE_ROOTS entry is needed (same-dir $include).
Fail-closed on `config get --json` errors so the daemon never silently
falls back to the leaky $include path. The inherit branch (null
mcp_config) still uses the live user file directly — no extra CLI
roundtrip and no snapshot is written.
New tests pin the contract Elon's review required:
- TestPrepareOpenclawConfigStrictReplacesUserMcpServers: user has
global_one + shared, managed has shared + managed_only → wrapper has
exactly {shared (managed value), managed_only}; global_one does NOT
leak; snapshot file has the user's `mcp` stripped while preserving
gateway / providers / API keys.
- TestPrepareOpenclawConfigStrictEmptyManagedSetDropsUserMcp: empty
managed set drops user's global_one (both `{}` and
`{"mcpServers":{}}` cases).
- TestPrepareOpenclawConfigNullMcpConfigKeepsUserInclude: null path
inherits the live user config, writes no snapshot, makes no extra CLI
call.
- TestPrepareOpenclawConfigFailsClosedOnResolvedConfigError: errors
during `config get --json` surface; no stale wrapper or snapshot.
- TestPrepareOpenclawConfigManagedSetFreshInstall: fresh install with
managed mcp_config skips the snapshot dance entirely.
Also tightens en + zh-Hans MCP Tab copy to mention OpenClaw goes via the
per-task wrapper, and to use OpenClaw's own `transport` field rather
than Claude's `type` for HTTP/SSE entries.
Co-authored-by: multica-agent <github@multica.ai>
* MUL-2778 fix(agent): narrow openclaw snapshot strip to mcp.servers only
Elon's third-round must-fix: the previous strict-replace snapshot deleted
the entire `mcp` block, which wiped out non-server settings under `mcp`
like `sessionIdleTtlMs`. Those are documented OpenClaw config keys
(https://docs.openclaw.ai/gateway/configuration-reference#mcp) outside
the MCP Tab's scope — the agent's saved mcp_config only manages server
definitions, so other `mcp.*` tuning the user set must survive.
Replace the blanket `delete(resolved, "mcp")` with a stripUserMcpServers
helper that:
- deletes only `mcp.servers` when `mcp` is an object
- drops the parent `mcp` key only when the object is empty after the
strip (so we don't emit `mcp: {}` placeholders)
- leaves non-object `mcp` values untouched (we only know how to strip
servers from the documented shape)
Pinned with TestPrepareOpenclawConfigStrictPreservesNonServerMcpKeys:
user resolved has both `mcp.sessionIdleTtlMs: 300000` and
`mcp.servers.global_one`; after the strict path runs the snapshot
keeps the TTL and drops the servers map, and the wrapper's
`mcp.servers` is exactly the managed set with no leak.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
The previous fix (#3446) for the i18next/no-literal-string CI failure
took the lazy route — added a file-level eslint-disable comment with
the rationale that the test page is slated for deletion when the real
billing UI ships, so investing in a translation namespace was wasted
churn.
That argument is weak in practice:
* The test page has been around since #3442 and there is no
concrete date for when the real billing UI lands. "Throwaway"
surfaces tend to outlive their stated half-life.
* File-level disables hide future violations of the same rule from
review — anyone touching this file later inherits an opt-out
they didn't ask for.
* The translation namespace is genuinely cheap. The page has ~50
distinct strings, all of which now have a single en.json/zh.json
home that the locale-parity test guards.
This commit replaces the silencer with a real `billing` namespace:
* packages/views/locales/en/billing.json + zh-Hans/billing.json —
every literal English label from the page, plus interpolated
sentences with i18next `{{var}}` placeholders. The zh-Hans
bundle is a basic translation; not perfect prose, but the parity
test passes and a Chinese reviewer testing the page won't see
raw English mixed with native UI chrome.
* packages/views/i18n/resources-types.ts — adds the namespace to
`I18nResources` so `t($ => $.billing.x.y)` is type-checked.
* packages/views/locales/index.ts — registers both bundles in
`RESOURCES` so the parity test sees them.
* packages/views/billing/billing-test-page.tsx — every JSX literal
rewritten as `t(($) => $.path)`, including aria-label,
interpolated sentences (balance meta line, transactions row meta,
paging footer), and conditional branches (with-bonus vs
without-bonus rendering goes through two distinct keys rather
than two literals). The file-level eslint-disable is removed
along with its multi-paragraph rationale comment.
* formatDate now takes `t` as an argument so the en/zh "—" dash
placeholder is itself translated; calling useT inside the util
would have violated the rule of hooks (the function is called
from conditional render branches).
Verification:
* pnpm --filter @multica/views lint — 0 errors (15 unrelated
warnings, all pre-existing on main)
* pnpm --filter @multica/views typecheck — clean
* pnpm --filter @multica/views test — 883/883 passing,
including the locale-parity test (which fails the build if
en and zh-Hans diverge)
When the real billing UI ships and this file is deleted, the
`billing` namespace JSONs and the `resources-types.ts` /
`locales/index.ts` entries get deleted with it — same blast radius
as the eslint-disable would have had, but with proper i18n along
the way.
* fix(daemon): cleanup .agent_context / .multica / provider skill sidecars after local_directory tasks (MUL-2784)
PR #3438 (MUL-2753) only restored CLAUDE.md / AGENTS.md / GEMINI.md to
their pre-task bytes; the sidecar tree writeContextFiles seeds
(.agent_context/, .multica/, .claude/skills/, .github/skills/,
.opencode/skills/, skills/, .pi/skills/, .cursor/skills/,
.kimi/skills/, .kiro/skills/, .agents/skills/, fallback
.agent_context/skills/) was explicitly deferred to this follow-up. In
local_directory mode the agent's workdir is the user's repo, so each
task accumulates one more layer of those directories in the user's
tree.
Plan A: track every file/dir Prepare creates inside workDir in a
sidecarManifest written to envRoot/.multica_sidecar_manifest.json
(daemon scratch — never in the user's workdir). On local_directory
teardown CleanupSidecars walks the manifest, removes the recorded
files, then rmdir-iterates the recorded directories in reverse.
Pre-existing files and directories are deliberately NOT recorded, so
a user-installed .claude/skills/my-own-skill/ sibling — or any
unrelated file the user keeps under .claude/, .github/, etc. — is
preserved bit-for-bit. Non-empty rmdir fails ENOTEMPTY and is
silently skipped, which is the signal that the user owns the
directory.
Daemon wiring lives next to the existing CleanupRuntimeConfig defer
in runTask: runtime brief first, sidecars second. Cloud-mode runs
still write a manifest for symmetry but never trigger the cleanup
(the GC loop wipes envRoot wholesale).
Tests (sidecar_manifest_test.go) cover the round-trip invariant per
the issue's acceptance criteria:
- empty workdir → Prepare → Cleanup → empty workdir, byte-exact, for
every file-based provider (claude, codex, copilot, opencode,
openclaw, hermes, pi, cursor, kimi, kiro, antigravity, gemini),
- user's .claude/skills/my-own-skill/ (and equivalents per
provider) survives Cleanup intact,
- unrelated user files under .claude/, .github/, etc. survive,
- three repeated cycles do not accumulate any orphan state,
- project_resources branch (.multica/project/resources.json) is
also reversible,
- recordWriteFile refuses to record pre-existing files,
- recordMkdirAll refuses to record pre-existing dirs,
- Cleanup is a no-op when the manifest file is missing.
Co-authored-by: multica-agent <github@multica.ai>
* fix(daemon): refuse to overwrite pre-existing sidecar paths; pick collision-free skill slugs (MUL-2784 review)
Addresses PR #3444 review (Elon):
**Must-fix #1**: recordWriteFile used to overwrite pre-existing target
files unconditionally and only skip the manifest record. That destroys
user bytes at write time AND leaves the corrupted contents in place at
cleanup time — the byte-exact contract the issue requires is violated
on both halves. Fixed by making recordWriteFile detect any pre-existing
entry (regular file, symlink, directory) via Lstat and return a
sentinel errPathPreExists without touching the path. The user's bytes
are preserved verbatim.
For per-skill collisions (user's .claude/skills/issue-review/ vs
Multica's "Issue Review"), writeSkillFiles now allocates a
collision-free sibling slug via allocateCollisionFreeSkillDir: first
attempt is the natural slug, then `<base>-multica`,
`<base>-multica-2`, …, bounded at 64 attempts. Provider-native
discovery still picks the skill up (every subdir under skillsParent is
a distinct skill) and the user's path stays bit-for-bit intact.
For Multica-only namespace files (.agent_context/issue_context.md,
.multica/project/resources.json), the writer swallows errPathPreExists
and continues — the runtime brief already carries every fact those
files would, so a collision degrades to brief-only mode rather than
destroying user content.
**Must-fix #2**: Added byte-exact collision matrix tests covering
every file-based provider (claude / codex / copilot / opencode /
openclaw / hermes / pi / cursor / kimi / kiro / antigravity / gemini):
- TestPrepareThenCleanupSidecarsSameSlugCollisionPerProvider: seeds
user's `<provider>/skills/issue-review/SKILL.md` plus a private
notes.md sibling, runs Prepare → Inject → Cleanup, asserts
workdir snapshot is byte-identical to seed.
- TestPrepareThenCleanupSidecarsIssueContextCollisionPerProvider:
seeds user's `.agent_context/issue_context.md`, asserts round-trip
preserves it.
- TestPrepareThenCleanupSidecarsProjectResourcesCollisionPerProvider:
same for `.multica/project/resources.json`.
- TestPrepareThenCleanupSidecarsMultiSkillCollisionFreeAllocation:
end-to-end check that the Multica skill lands at the
collision-free sibling and Cleanup removes only the Multica side.
- TestAllocateCollisionFreeSkillDir: directed unit test pinning the
slug-bumping sequence.
- TestRecordWriteFileRefusesToOverwritePreExistingFile (was
TestRecordWriteFileSkipsPreExistingFile): flipped to assert the
user's bytes survive and errPathPreExists is returned.
- TestRecordWriteFileRefusesToOverwriteSymlinkOrDir: covers the
Lstat path for non-file entries.
**Should-fix**: CleanupSidecars used to swallow ANY non-ENOENT rmdir
error as "user content present," silently dropping real I/O failures
(EACCES, EPERM, EBUSY). Now it re-reads the directory after a failed
rmdir via the new dirHasEntries helper — non-empty → silently skip
(ENOTEMPTY, the intended branch); empty → genuine error, captured
into firstErr and surfaced. Plus directed tests:
- TestCleanupSidecarsSurfacesRealRmdirErrors
- TestDirHasEntries
Local verification:
- go test ./internal/daemon/execenv/... — all green
- go test ./internal/daemon/... — all green
- go vet ./... — clean
Co-authored-by: multica-agent <github@multica.ai>
* fix(daemon): surface original rmdir error when post-rmdir ReadDir also fails (MUL-2784 review)
Addresses remaining PR #3444 review blocker (Elon): dirHasEntries used
to return true when ReadDir failed with anything other than ENOENT,
which made CleanupSidecars treat every locked / faulted directory as
ENOTEMPTY and silently drop the original rmdir error. The v1 fix from
the previous round closed the EACCES-on-empty-dir branch but missed
the case where the chmod also blocks ReadDir — exactly the failure
mode the review called out.
Helper change: dirHasEntries now returns (hasEntries, ok bool):
- (false, true) — dir exists and is empty (or missing, race-safe)
- (true, true) — dir has user content (the ENOTEMPTY branch)
- (_, false) — ReadDir failed (EACCES, ENOTDIR, EIO, …); the
caller cannot tell ENOTEMPTY from a real error
and MUST surface the original rmdir error
CleanupSidecars switches on (ok, hasEntries):
- !ok → surface the ORIGINAL rmdir error (not the
ReadDir failure — that's diagnostic plumbing
and would distract from the root cause)
- ok && hasEntries → swallow silently (intended ENOTEMPTY branch;
preserve user content)
- ok && !hasEntries → surface the rmdir error (empty dir + EACCES /
EPERM / EBUSY → genuine cleanup failure)
Tests:
- TestDirHasEntries: extended with a regular-file sub-case (ReadDir
returns ENOTDIR) asserting (false, false). The v1 helper returned
(true) here, hiding the bug.
- TestCleanupSidecarsSwallowsMissingAndNonEmptyDirs: renamed from
TestCleanupSidecarsSurfacesRealRmdirErrors. The old name claimed
to test the surfacing path but never actually exercised it.
- TestCleanupSidecarsSurfacesEACCESOnEmptyRecordedDir: chmod parent
to 0o555 so rmdir(recorded) fails EACCES while ReadDir(recorded)
still succeeds (empty). Asserts firstErr is non-nil and references
both the recorded path and the rmdir branch. Skipped when running
as root (chmod is bypassed for uid 0).
- TestCleanupSidecarsSurfacesEACCESWhenReadDirFailsToo: the must-fix
case — chmod parent 0o555 AND chmod recorded 0o000 so BOTH rmdir
and ReadDir fail. The surfaced error must be the ORIGINAL rmdir
failure, not the ReadDir one. Skipped on uid 0.
Local verification:
- go test ./internal/daemon/execenv/... — all green
- go test ./internal/daemon/... — all green
- go vet ./... — clean
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
billing-test-page.tsx (introduced in #3442) ships dozens of raw English
JSX labels and is explicitly documented as "not a finished UI — slated
for deletion when the real billing UI ships". Routing every label
through useT() and inventing throwaway translation keys would be
wasted churn that gets deleted in the same week the real UI lands.
Add a file-level eslint-disable so CI passes on every PR that touches
unrelated files. When the real billing UI replaces this page, this
file (and the disable) gets deleted together.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
CompleteTask / FailTask used to be fire-once. A 1-second upstream 502
burst would drop the call, then the immediate fail-fallback also 502'd,
leaving the task stuck in `running` forever and showing the agent as
"still working" in the UI.
Add a bounded retry around the two terminal callbacks: 4s, 8s, 16s,
32s, 64s backoff schedule (5 retries, ~124s ceiling), retrying only
on transient errors (5xx, 408, 429, transport-level) and bailing
immediately on permanent 4xx. Also fix a latent bug where a transient
complete failure would silently downgrade a successful run to a fail:
the fallback now triggers only on permanent errors. Server-side
CompleteTask / FailTask are already idempotent on "already terminal",
so replays from a retry are safe even if the prior 502'd response was
actually persisted.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* feat(billing): test page consuming /api/cloud-billing/*
Stuffs every cloud-billing endpoint onto a single dev page so we can
verify the proxy + Stripe end-to-end flow without a designed UI.
Reachable at /<workspaceSlug>/billing — the page is account-level
data but lives under the workspace dashboard layout because that's
where the authenticated shell sits. No sidebar entry on purpose;
this is test-quality and meant to be deleted when the real billing
UI ships.
What's there:
* Balance card (GET /balance)
* Stripe-success polling banner — visible only when ?session_id=
is in the URL (Stripe substitutes it into checkout_success_url
on its way back). React Query refetchInterval polls every 2s
until the topup status reaches credited / failed / canceled,
then a 'Clear from URL' button calls navigation.replace(pathname).
* Buy section: server-authoritative price tier buttons (GET
/price-tiers) → POST /checkout-sessions → window.location to the
Stripe URL. We do NOT hard-code amounts on the frontend; tier
config lives in cloud's billing.price_tiers.
* Stripe Billing Portal button (POST /portal-sessions). Opens in a
new tab so the originating page stays put for easy verification.
Documented behaviour: 400 is expected for users with no Stripe
customer record yet.
* Three lists: transactions / batches / topups.
Plumbing:
* packages/core/types/billing.ts — interfaces mirroring the cloud
response shapes. Status / source / tx_type fields are typed
'string' rather than enum unions to match the schemas' z.string()
parsing (same convention as CloudRuntimeNode); the canonical
enum values are exported as separate type aliases for callers
that want to switch on them.
* packages/core/api/schemas.ts — 9 zod schemas + 7 EMPTY_ fallbacks,
all .loose() so a non-breaking cloud-side field addition doesn't
crash the parser.
* packages/core/api/client.ts — 8 methods using parseWithFallback,
matching the existing cloud-runtime shape.
* packages/core/billing/{queries,mutations,index}.ts — React Query
queryOptions + mutations. Notable choices: balance / lists are
NOT keyed on workspace (account-level data), and the
checkout-session polling stops automatically when status is
terminal so we don't poll forever after a user closes the tab.
* packages/core/package.json + packages/views/package.json — exports
map updated for @multica/core/billing and @multica/views/billing.
Verification:
* pnpm --filter @multica/core typecheck clean
* pnpm --filter @multica/views typecheck — only pre-existing
hast-util-to-html error in editor code (exists on main)
* pnpm --filter @multica/core test — 412 passing
* pnpm --filter @multica/views test — 877 passing, 1 failure
(editor/readonly-content) is also pre-existing on main, not
caused by this change
Out of scope: real production-quality billing UI; sidebar entry; i18n
strings; mobile app. This is a single test page; it gets replaced
when the real UI ships.
* fix(billing): refetch balance/lists when checkout polling reaches terminal
Closes the second-half of the Stripe-return race the previous commit
left dangling.
Symptom:
After Stripe redirects back with ?session_id=..., the banner polls
/checkout-sessions/{id} every 2s and the rest of the page (balance,
transactions, batches, topups) is fetched once on mount. The
webhook race means those four queries usually see pre-credit state
— but the banner is the only thing that keeps polling, so once it
reads 'credited' nothing else on the page knows. The user would
see 'Final status: credited' next to a stale balance card until
they manually refresh.
Fix:
Add useInvalidateBillingDataAfterCredit() in @multica/core/billing —
a hook returning a callback that flushes balance / transactions /
batches / topups (NOT the checkout-session itself; its
refetchInterval already terminated, refetching would just confirm
the same value). The Stripe-success banner runs this callback in
a useEffect keyed on terminal-status transition, so it fires
exactly once when the polling lands.
Strict scope is documented in the hook's JSDoc:
- balance/transactions/batches: only change at the 'credited'
transition (cloud writes ledger + batch + wallet in one DB tx)
- topups: changes on every terminal transition
- For 'failed' / 'canceled' we technically over-fetch the first
three; three cheap round-trips, simplifies the call site, fine
on a test page.
Effect dep is . terminal flips false→true at most once
per session id (the polling stops when terminal is true so the
data won't change again). If the user lands here with a session
that is already terminal (re-opened tab on a credited URL), the
effect still fires on first data load and we still re-fetch —
correct, the cached snapshot is just as stale in that case.
go build / pnpm typecheck / pnpm test clean (core 412 passing; only
pre-existing hast-util-to-html error in unrelated editor code on
views, same as on main).
* feat(self-host): DISABLE_WORKSPACE_CREATION env var (MUL-2777, #3433)
When self-hosters set DISABLE_WORKSPACE_CREATION=true, POST /api/workspaces
returns 403 for every caller and the UI hides every "Create workspace"
affordance (sidebar, modal, /workspaces/new page, onboarding Step 2). This
closes the gap where ALLOW_SIGNUP=false still let any signed-in user open
an isolated workspace the platform admin couldn't see.
- server: new Config.DisableWorkspaceCreation, gate in CreateWorkspace,
workspace_creation_disabled in /api/config, Go tests.
- frontend: new workspaceCreationDisabled in configStore, hide sidebar
entry, swap NewWorkspacePage / CreateWorkspaceModal / onboarding
StepWorkspace to a "creation disabled, ask for invite" state when the
flag is on, EN + zh-Hans locale strings.
- ops: .env.example, docker-compose.selfhost, helm values + configmap,
SELF_HOSTING.md, SELF_HOSTING_ADVANCED.md, environment-variables docs
(EN + zh).
Co-authored-by: multica-agent <github@multica.ai>
* fix(onboarding): drive create path off workspaceCreationAllowed (#3433)
PR #3441 review: when DISABLE_WORKSPACE_CREATION=true and the user already
has a workspace, StepWorkspace still walked the resume copy (`headline_resume`
/ `lede_resume` mentioning "or start another") and `creatingActive` ignored
the flag, leaving a stale clickable create CTA possible if /api/config
arrived late.
Refactor StepWorkspace to derive a single `workspaceCreationAllowed`
boolean from the config store. It now drives:
- Initial `mode` state (defaults to "existing" when disabled + reusing so
the CTA is pre-armed for the only valid action).
- `creatingActive` so the footer CTA cannot fall back into the create
branch even mid-render.
- Eyebrow / headline / lede strings — adds
`creation_disabled_{eyebrow,headline,lede}_resume` (EN + zh-Hans) for
the disabled + reusing variant.
Tests: cover the three reachable shapes — flag off + no existing, flag on
+ no existing, flag on + existing.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* MUL-2764 feat(agent): wire mcp_config through ACP runtimes (Hermes / Kimi / Kiro)
The MCP config Tab (#3419) already lets admins save mcp_config on an
agent, and the daemon plumbs it through to `agent.ExecOptions.McpConfig`
for every runtime. Claude and Codex consume it; the three ACP runtimes
(Hermes / Kimi / Kiro) ignored the field and hardcoded an empty
`mcpServers: []` in their `session/new` requests.
Add `buildACPMcpServers` to translate the Claude-style `{"mcpServers":
{"<name>": {...}}}` object-of-objects into the array shape ACP requires
(`[{name, command, args, env: [{name,value}, ...]}, ...]` for stdio;
`[{type, name, url, headers: [...]}, ...]` for http/sse), then pass the
translated array on `session/new` (all three) and `session/load` (kiro
resume). Malformed JSON fails the launch closed — same contract Codex's
`renderCodexMcpServersBlock` uses — so users see a real error instead of
silently running with no MCP servers. Individual unclassifiable entries
(no command, no url) are skipped with a warning so one bad row can't
take MCP down for the rest of the agent.
Co-authored-by: multica-agent <github@multica.ai>
* MUL-2764 fix(agent): wire mcp_config through ACP resume + gate http/sse on capability
Addresses the two blockers Elon raised on #3439:
1. session/resume now carries mcpServers for Hermes and Kimi (Kiro's
session/load already did). Per the ACP Session Setup spec the resume
path re-attaches MCP servers, and without this a resumed task lost
access to MCP tools that a fresh task on the same agent would have
had. Pinned with new TestHermesResumeIncludesMcpServers and
TestKimiResumeIncludesMcpServers integration tests that inspect the
recorded wire request.
2. Added extractACPMcpCapabilities + filterACPMcpServersByCapability so
http/sse MCP entries get dropped (with a daemon-log warning naming
the entry) when the runtime's initialize response doesn't advertise
mcpCapabilities.http / .sse. Sending those entries to a stdio-only
runtime is a spec violation and reliably tanks session/new; now they
get filtered and the rest of the session still starts. Stdio entries
pass through unconditionally. Both backends wire the filter in right
after initialize so session/new and session/resume see the same
filtered list.
Also added TestKiroLoadIncludesMcpServersFromConfig — Elon flagged that
no test pinned "non-empty mcp_config actually reaches the wire" for
Kimi/Kiro, so the wire assertions go in for all three runtimes.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* fix(daemon): preserve user CLAUDE.md / AGENTS.md / GEMINI.md in local_directory runs (MUL-2753)
InjectRuntimeConfig previously called os.WriteFile unconditionally, which
truncated whatever file lived at the same path. For the local_directory
project_resource flow the workdir is the user's own repo, so the agent
silently destroyed any repo-level CLAUDE.md / AGENTS.md / GEMINI.md the
first time it ran in that directory, and the daemon's local-directory
cleanup explicitly skips the user's path so the file was never restored.
Write the brief inside a marker block instead:
<!-- BEGIN MULTICA-RUNTIME (auto-managed; do not edit) -->
...brief...
<!-- END MULTICA-RUNTIME -->
writeRuntimeConfigFile handles three states:
- file missing -> create with just the marker block,
- file present, no marker block -> append the marker block at the end
(preserves user-authored content above), and
- file present, marker block already there -> replace the block body in
place so repeated runs don't grow the file unboundedly.
This is the short-term fix called out on MUL-2753. The sidecar question
(.agent_context/, .claude/skills/, .multica/project/resources.json) is
left for a follow-up — those files don't overwrite user content, just
litter the workdir.
Co-authored-by: multica-agent <github@multica.ai>
* fix(daemon): cleanup runtime config marker block after local_directory tasks (MUL-2753)
Address Elon's review on PR #3438:
1. Add `CleanupRuntimeConfig` and wire it into the daemon's task path so
`local_directory` runs excise the marker block on the way out. Without
it, a user's subsequent manual `claude` / `codex` / `gemini` run in
the same directory picks up the previous task's stale brief (issue
id, trigger comment id, reply rules) and acts on the wrong context.
Cloud workspace runs skip the cleanup — their scratch workdir is
wiped by the GC loop anyway.
2. If excising the block would leave the file empty / whitespace-only,
the file is removed so we don't leave behind a stub the user has to
delete by hand. Surviving user content is preserved byte-for-byte.
3. Harden the marker parser: search for the end marker strictly after
the begin marker. The previous `strings.Index` pair mishandled two
malformed cases —
- a stray end marker before any begin (e.g. user pasted a
documentation snippet showing the wire format) would cause
every run to stack another block, growing the file unboundedly;
- a half-block left by a previous crashed run would cause every
subsequent run to append a fresh block beneath the half-block.
The `locateMarkerBlock` helper now anchors the end search past the
begin offset, and treats "begin found, no end after" as "block runs
to EOF" so the next write replaces it cleanly.
Centralised the provider→filename mapping in `runtimeConfigPath` so
Inject and Cleanup can't drift past each other when a new provider is
added.
Tests cover: parser hardening (stray-end-before-begin idempotency,
half-block recovery), Cleanup happy path / file removal / no-op cases /
malformed half-block / per-provider mapping, and an end-to-end
inject→cleanup round trip that locks in byte-identical restoration of
the user's pre-injection file.
Co-authored-by: multica-agent <github@multica.ai>
* fix(daemon): byte-exact inject/cleanup round trip for runtime config (MUL-2753)
Address Elon's second-round review on PR #3438. The previous cleanup
relied on `TrimRight + "\n"` for trailing newlines and `TrimSpace == ""`
for file removal — both compensated for the inject path's "normalise
trailing newlines so there's always exactly `\n\n` before the block"
step, but they did so by mutating the user's bytes. The result was a
real diff on three boundary cases:
- file ended without a newline (`rules`) → cleanup added one;
- file ended with two or more newlines (`rules\n\n`) → cleanup
collapsed to a single newline;
- file pre-existed but was empty / whitespace-only → cleanup
deleted it.
Reshape the contract so the bytes inject adds are the exact bytes
cleanup removes, with no user-byte mutation in between:
- Define `runtimeManagedSeparator = "\n\n"` as a fixed managed
separator that inject always inserts (unconditionally — including
for files that already end in two or more newlines) between
pre-existing user content and the marker block.
- Inject's missing-file branch still writes the block alone (no
separator); that absence is the marker Cleanup uses to identify
"we created this file from scratch" and is the only condition
under which Cleanup is allowed to `os.Remove` the file.
- Cleanup detects `HasSuffix(pre, runtimeManagedSeparator)` and
strips exactly those bytes; whatever remains is written back
verbatim with no `TrimRight` / `TrimSpace`, so the pre-injection
bytes survive exactly.
The replace-in-place branch is untouched — the managed separator
established by the first inject lives in pre and survives across
subsequent runs, so byte-exactness is preserved through arbitrary
inject→inject→cleanup chains.
Tests:
- `TestInjectThenCleanupRoundTripByteExactBoundaries` parameterises
9 seed shapes (missing file, empty, whitespace-only, no trailing
newline, one trailing newline, two trailing newlines, many
trailing newlines, CRLF line endings, no final newline with
embedded blank lines) and asserts byte-identical round trip
across two full cycles.
- `TestInjectReplaceThenCleanupRestoresByteExact` covers the
replace-in-place branch for the same boundary seeds.
- `TestWriteRuntimeConfigFileAlwaysInsertsFixedManagedSeparator`
pins the new invariant at the source: regardless of seed shape,
inject emits `<seed><\n\n><marker block>` with no normalisation.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
Real workdir paths are routinely long enough to push every other chip
off the transcript-dialog metadata row, leaving the row scrolling or
wrapping awkwardly. Turn the chip into a fixed-width button:
- max-w-[16rem] + truncate so the path tail gets a clean ellipsis no
matter the depth
- title attribute carries the full relative_work_dir for a hover peek
- click copies relative_work_dir to clipboard, icon flips to a green
Check for 2s as feedback
- swap FolderTree icon for the simpler Folder mark
The pre-existing privacy invariant is preserved unchanged: only the
server-cleaned relative_work_dir reaches the DOM / title / clipboard;
the absolute task.work_dir still never leaves the server response.
* feat(db): add unresolved comment count + list filter queries
Add CountUnresolvedComments (excludes the agent's own comments) and
ListUnresolvedCommentsForIssue. Both are additive — existing callers stay
on the unfiltered queries — so old clients are unaffected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(handler): support unresolved-only comment listing
Wire an additive `unresolved` query param into ListComments. Defaults off
so an old CLI that never sends it gets unchanged behavior; only true/1
enable it. Rejects combining unresolved with thread/recent (whole-issue
filter vs navigation models). Includes filter + count query tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(handler): plumb unresolved count + thread root into claim, gate comment resume
Populate trigger_parent_id (thread root of the trigger comment) and
unresolved_count (excludes the agent's own comments) on comment-triggered
claim responses. Both fields are omitempty so old daemons ignore them.
Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION
(default off): resumed comment turns can inherit the prior turn's "Done."
final message, so this stays an explicit rollout switch. The runtime-match
and poisoned-session guards still apply regardless of the flag.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(daemon): inject unresolved-comments hint + resolve step into agent brief
Add a shared BuildUnresolvedCommentsHint helper rendered on both the
per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It
ships only the count and the relevant CLI call — never comment bodies — so
the server stays cheap. Thread case points at --thread <root>; issue case
points at --unresolved. Suppressed when the count is 0.
Also add a workflow step telling the agent to `multica comment resolve
<thread-root>` once a thread is fully handled, so the unresolved set
converges.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(cli): add comment list --unresolved and comment resolve command
Add an --unresolved filter to `issue comment list` (wired to the server's
unresolved param, rejected when combined with --thread/--recent) and a
top-level `comment resolve <id>` command that POSTs to the existing
/api/comments/{id}/resolve endpoint, letting an agent close threads it has
fully handled.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(comments): since-delta new-comment hint + default-on comment resume
Simplifies the comment-triggered agent flow down to what's actually needed:
- New-comment awareness is now a pure time delta: the claim response carries
new_comment_count + new_comments_since (anchored on the prior run's
started_at, never completed_at so a long run can't miss comments). The
per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s)
since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the
two surfaces can't drift. Cold start (no prior run) falls back to a plain read.
- Comment-triggered tasks resume the prior session by default (same runtime),
dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS
comment" prompt guard defends against inheriting the prior turn's "Done."
marker; GetLastTaskSession still excludes poisoned sessions.
- Drops the resolved-based machinery from the first draft: CountUnresolvedComments
/ ListUnresolvedCommentsForIssue queries, the `comment list --unresolved`
flag, the `multica comment resolve` command, and the resolve workflow step.
- Removes the verbose cursor-pagination paragraph from the comment prompt; the
--thread/--recent/--since flags stay in the CLI/API, just no longer explained
inline every turn.
Compatibility: new claim fields are omitempty (old daemons ignore them).
Comment resume is default-on and affects even old daemons, which already
consume prior_session_id.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Next.js types PNG imports as StaticImageData ({ src, width, height });
vite/electron-vite types them as plain string. Component is consumed by
both apps/web (Next.js) and apps/desktop (electron-vite), so a single
type can't satisfy both — last CI failed apps/web typecheck.
Normalise via unknown at the import site so neither side's narrower
type causes the other side's branch to collapse to never. assets.d.ts
declares the union; the component derives a plain-string src once at
module load.
* MUL-2771: feat(transcript): server-derived relative work_dir chip
Adds a privacy-safe `relative_work_dir` field to the agent task wire
shape so the transcript dialog can show where a task ran without
leaking the user's home directory. Standard tasks strip the daemon's
workspaces root to `<wsUUID>/<taskShort>/workdir`; local_directory
tasks fall back to the trailing two path segments (`repos/foo`),
which keeps enough context for the user to recognise the directory
without exposing $HOME or the username.
The derivation lives in `taskToResponse` so every endpoint that
serves a task — list, snapshot, claim, rerun, cancel, complete,
fail — fills the field consistently. taskToResponse now also
populates `workspace_id`, which the prior shape declared but never
set. shortTaskID mirrors execenv.shortID; a colocated test pins the
two helpers together so future daemon-side layout changes don't
silently degrade the chip into the local_directory fallback.
Replaces the front-end stripping attempt in PR #3379, which passed
issue_id where workspace_id was required and therefore rendered the
full absolute path on every standard task.
Co-authored-by: multica-agent <github@multica.ai>
* MUL-2771: harden privacy guards on transcript work_dir chip
Address second-round review feedback from PR #3428:
1. Drop the `title={task.work_dir}` tooltip in the transcript dialog.
The visible chip was safe but native browser tooltips re-rendered the
absolute `/Users/<name>/...` on hover, leaking into screen shares,
screenshots, and recordings — defeating the stated goal of the chip.
The absolute path now never reaches the DOM (no title, aria, or data
attribute).
2. Replace the "tail two segments" fallback for local_directory paths
with explicit home-prefix stripping plus a basename-only final
fallback. The old behaviour leaked the username on shallow paths like
`/Users/alice/foo`, `/home/alice/project`, and `C:\Users\alice\foo`.
The new behaviour recognises common per-user home layouts on macOS,
Linux, and Windows (case-insensitive), strips them down to the
remainder, and falls back to the basename for any path under an
unrecognised root — a single segment can never carry the home prefix.
3. Align the Go and TypeScript field comments with the real fallback
policy so future readers see "strip home / basename" instead of the
outdated "tail two segments" description.
Tests: expanded `TestRelativeWorkDir` to cover shallow `/Users/...`,
`/home/...`, and `C:\Users\...` paths, the exact-home edge cases,
case-insensitive matching, and the non-home basename-only fallback.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* feat(agent): add Antigravity runtime backend
Adds Google's Antigravity CLI (`agy`) as the 12th supported coding-tool
runtime, alongside Claude / Codex / Cursor / Copilot / Gemini / Hermes /
Kimi / Kiro / OpenCode / OpenClaw / Pi.
The CLI emits plain assistant text on stdout (no structured event
stream), so the backend streams stdout line-by-line as `MessageText`
events and accumulates the same text as the final `Result.Output`.
Session resumption uses `--conversation <id>`; because the conversation
UUID is not echoed on stdout, the daemon routes `--log-file` to a temp
file and recovers the id from the glog-formatted log lines.
MUL-2767
Co-authored-by: multica-agent <github@multica.ai>
* fix(agent): correct Antigravity capability contract from Elon review
- ModelSelectionSupported now returns false for antigravity. `agy` has no
--model flag and antigravityBackend deliberately drops opts.Model, so
the UI must render a disabled "Managed by runtime" picker instead of
an empty dropdown plus a silently-ignored manual-entry field. Also
stop seeding AgentEntry.Model from MULTICA_ANTIGRAVITY_MODEL — the
backend would silently ignore it.
- Antigravity skills now write to {workDir}/.agents/skills/, the CLI's
native workspace path (inherits Gemini CLI's layout per
https://antigravity.google/docs/gcli-migration). Previously they went
to the .agent_context/skills/ fallback that the CLI doesn't scan.
Runtime brief moves antigravity into the native-discovery branch and
local_skills.go points the user-level skill root at
~/.gemini/antigravity-cli/skills for Runtime → local skill import.
- Doc + UI comment sync: providers matrix / install-agent-runtime /
cloud-quickstart / agents-create / tasks (session-resume support) /
skills / README all now list Antigravity in the right buckets, and
the model-picker / model-dropdown comments cite antigravity (not the
stale hermes reference) as the supported=false example.
New tests: TestAntigravityModelSelectionUnsupported,
TestInjectRuntimeConfigAntigravity (native discovery wording),
TestWriteContextFilesAntigravityNativeSkills (.agents/skills/ landing,
.agent_context/skills/ NOT written).
Co-authored-by: multica-agent <github@multica.ai>
* feat(provider-logo): swap inline placeholder for real Antigravity PNG
Replaces the hand-drawn planet+arc placeholder with the official asset
shipped from Downloads. Stored next to the component; bundlers
(Next.js / electron-vite) resolve the PNG import to a URL string at
build time. Added a small assets.d.ts so packages/views' tsc accepts
PNG / SVG module imports — there was no prior asset usage in this
package to register the declaration.
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* MUL-2764: feat(agents): add MCP config tab to agent detail page
Backend already stores `mcp_config` and the daemon forwards it to the
runtime CLI via `--mcp-config`; this only adds the UI entry point.
The new tab presents a JSON editor that pretty-prints the existing
config, validates the buffer on every keystroke, and saves through the
existing `PUT /api/agents/{id}` path. Clearing the editor sends
`mcp_config: null`, which the handler reads as "wipe the column" and
the daemon falls back to the CLI's own default.
When the caller can't see secrets (agent actor, or a non-owner
non-admin member), the server already returns `mcp_config: null` with
`mcp_config_redacted: true`; the tab renders a read-only "configured
but hidden" state in that case so a non-privileged member cannot
silently overwrite an admin-owned config by saving an empty editor.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agents): MCP tab — preserve in-flight edits + warn non-Claude runtimes
- Fix stale-editor sync: compare the local draft against the *previous*
original via a ref, so a background agent refetch updates an untouched
editor instead of being silently ignored. Without this, a draft equal to
the OLD original was treated as user-edited after the prop changed, and
the next Save would write the old config back over a concurrent admin
edit.
- Surface a notice inside the tab when the agent's runtime provider is not
Claude — today's daemon only forwards mcp_config via Claude's
--mcp-config, so saving on e.g. a Codex agent was silent but ineffective.
- Tests for both: rerender resyncs an untouched editor, rerender preserves
an in-flight edit, warning renders on non-Claude / hides on Claude.
MUL-2764
Co-authored-by: multica-agent <github@multica.ai>
* MUL-2764: feat(agents): codex MCP support + hide MCP tab on unsupported runtimes
- Backend: codex.go now translates agent.mcp_config (Claude-style
`{"mcpServers": {...}}`) into `-c mcp_servers.<name>=<inline-toml>`
flags for `codex app-server`, so MCP servers configured in the UI
reach Codex's per-task config layer. Bad mcp_config JSON downgrades
to a warn-and-skip so it can't break the agent launch.
- Frontend: AgentOverviewPane hides the MCP tab when the agent's
runtime provider doesn't read mcp_config — only `claude` and `codex`
are supported today, every other provider sees no MCP tab. The
previous in-tab warning is removed (no longer reachable).
- New shared helper `providerSupportsMcpConfig` lives in
`@multica/core/agents` so views and any future caller share one list
of MCP-aware providers.
- Tests: new go-side coverage for stdio + url + multi-server inputs,
TOML string escaping, malformed-input fallback, and arg ordering vs
custom_args; new views-side coverage for which providers surface the
MCP tab. En + zh-Hans copy and parity test refreshed.
Co-authored-by: multica-agent <github@multica.ai>
* MUL-2764: fix(agents): keep codex mcp_config secrets out of argv/logs
Move the agent's mcp_config from a `-c mcp_servers.<id>=<inline-toml>`
argv flag into a daemon-managed `[mcp_servers.*]` block inside the
per-task `$CODEX_HOME/config.toml`. mcp_servers.<id>.env is a documented
Codex config field and the UI already treats mcp_config as redacted for
non-admins; argv would have leaked those values into `ps aux` and the
`agent command` log line. The file is forced to 0600 to keep secrets in
the daemon owner's lane regardless of the seed file's mode.
Also drop user-supplied `-c/--config mcp_servers.*` entries from
custom_args. Codex `-c` is last-wins (verified against codex-cli 0.132.0),
so without filtering, a custom_args entry could silently shadow whatever
the MCP Tab saved.
Strip inherited `[mcp_servers.*]` tables from the per-task config.toml
when the agent has its own mcp_config, mirroring Claude's
`--strict-mcp-config`: avoids TOML "table already exists" errors on
name collisions and matches admin expectations that the MCP Tab is the
authoritative source for that task.
Co-authored-by: multica-agent <github@multica.ai>
* MUL-2764: fix(agents): codex mcp_config three-state semantics + custom_args compat
Address the third review pass:
1. Distinguish nil vs present-but-empty mcp_config. `{}` and
`{"mcpServers":{}}` now count as "admin saved an explicit (empty)
managed set" — strip inherited user `[mcp_servers.*]` and pin an
empty managed marker block. Only SQL NULL / JSON `null` map to
"absent" and fall back to the user's global `~/.codex/config.toml`.
This aligns Codex with the API's three-state contract (omit / null
/ object) and with Claude's `--strict-mcp-config` semantics.
2. Fail closed on `ensureCodexMcpConfig` errors and on managed
mcp_config without CODEX_HOME. Previous warn-and-launch would
silently inherit the user's global MCP servers and look identical
to a successful apply — exactly the surprise the MCP Tab is meant
to remove.
3. Only filter `-c mcp_servers.*` from `custom_args`/`extra_args`
when the agent has a managed mcp_config. Pre-MUL-2764 agents that
configured MCP via custom_args keep working; once an admin opts
in via the MCP Tab the daemon owns the `mcp_servers` namespace
and overrides are dropped (last-wins safety).
4. Update mcp_config locale intro to mention $CODEX_HOME/config.toml
instead of the now-removed `-c mcp_servers.*` argv path.
Tests:
- Split `TestEnsureCodexMcpConfigEmptyInputsAreNoop` into
`TestEnsureCodexMcpConfigAbsentLeavesUserTablesAlone` (nil/null)
and `TestEnsureCodexMcpConfigEmptyManagedSetStripsUserMcp` (`{}`,
`{"mcpServers":{}}`).
- Add `TestEnsureCodexMcpConfigEmptyManagedSetIdempotent` to pin
byte-identical reruns on the empty managed marker block.
- Add `TestHasManagedCodexMcpConfig` covering the eight relevant
inputs.
- Add `TestBuildCodexArgsPreservesCustomMcpOverridesWhenUnmanaged`
and `TestBuildCodexArgsDropsCustomMcpOverridesWhenManaged` to
pin the new gating.
- Add `TestCodexExecuteFailsClosedWhenMcpConfigInvalid` and
`TestCodexExecuteFailsClosedWhenManagedMcpButNoCodexHome` for the
Execute paths.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
GH#3405 / MUL-2768. Self-host docs already point at the SMTP path, but on-prem operators ran into two gaps:
- The Option B env block in auth-setup and self-host-quickstart only showed a 587 authenticated example, with no copy-pasteable block for the most common Exchange "anonymous internal relay on port 25" pattern, and no explicit mapping between port / auth / TLS / supported-or-not.
- troubleshooting "Emails not received" only covered Resend; SMTP failures (smtp dial / starttls / auth / MAIL FROM / RCPT TO / DATA) surface as wrapped errors in the backend logs, but operators had no doc telling them which Exchange-side fix maps to each.
Adds:
- A relay-mode table (anonymous 25 / authenticated 587 / 465 still unsupported) and two copy-pasteable env blocks in both auth-setup.mdx and self-host-quickstart.mdx (EN + ZH).
- Explicit note on the EmailService startup log line so operators can confirm SMTP is the active provider after restart, without leaking credentials.
- An SMTP failure-mode table in troubleshooting.mdx (EN + ZH) keyed on the exact wrapped error string, with the Exchange-side fix for each.
No code changes; env variable surface unchanged (still SMTP_HOST / SMTP_PORT / SMTP_USERNAME / SMTP_PASSWORD / SMTP_TLS_INSECURE). Port 465 stays "not supported" pending #3340.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
cleanupDeletedIssueCaches now also calls a new
useRecentIssuesStore.forgetIssue(wsId, issueId) action so the persisted
Recent Issues bucket no longer keeps deleted ids around. Both the delete
mutation and the WS delete event flow through the same cleanup, so this
covers self-delete and cross-client delete. Without this, Cmd+K fires a
detail query for every recent id on open and returns a steady stream of
404s for issues the user has deleted (#3413).
MUL-2765
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
The `!data/**` negation that rescues apps/mobile/data/ from the
root .gitignore's `data/` rule was inadvertently pulling .DS_Store
back in too — Finder metadata kept showing up in git status. Restate
.DS_Store after the negation so last-match-wins re-ignores it.
Pi is installed on Windows via npm, which lays down `pi.cmd` → `pi.ps1`
→ `node_modules/@mariozechner/pi-coding-agent/dist/cli.js`. The daemon
spawns Pi with `exec.Command("pi", ...)`; PATHEXT resolves that to
`pi.cmd`, and cmd.exe expands `%*` in the shim by re-tokenising the
original command line, which truncates any argv containing newlines.
buildPiArgs passes the full prompt as the last positional argv, so the
multi-line system+user prompt is silently cut at the first newline
before it reaches the JS entrypoint. The session JSONL then records
only the first line ("You are running as a chat assistant for a Multica
workspace.") and Pi replies as if the user message were missing
(GitHub multica-ai/multica#3306).
Mirror the existing cursor-agent fix: when LookPath resolves Pi to a
.cmd/.bat launcher and a sibling pi.ps1 exists, invoke PowerShell with
`-File <ps1>` directly and forward each arg as a discrete token. This
keeps us on the official launch path while skipping the cmd.exe %*
re-expansion. Falls back to the original launcher when pi.ps1 or
PowerShell can't be located.
The Windows test asserts the rewrite produces the expected argv and
that the multi-line positional prompt survives unchanged.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* MUL-2744: feat(auth): auto-renew daemon PAT in-place within 7-day window
Daemons currently hold a 90-day PAT and have no renewal path: once the
token's expires_at passes, every request 401s and the user has to find
the silent failure in the daemon log and re-run `multica login`.
This adds an in-place renewal:
- New `POST /api/tokens/current/renew` (Auth-protected, mul_ only). The
server checks remaining lifetime: ≥ 7 days is a no-op; < 7 days bumps
expires_at to now + 90 days via a guarded UPDATE that makes concurrent
renews idempotent (the WHERE expires_at < $2 clause means only one
writer wins; the loser sees pgx.ErrNoRows and reports the already-
extended value). No raw token rotation — the same secret stays in
every CLI/daemon process sharing the config.
- Daemon-side `tokenRenewalLoop`: fires once on startup (covers
machine-was-off cases) and then every 3 days. With a 7-day server
threshold this gives at least two renewal attempts before the window
closes, so a single network blip can't push the token out.
- 401 fallback: when the renew call comes back 401 (token already
revoked/expired), the daemon logs a user-actionable WARN telling the
operator to run `multica login` — instead of the current silent
failure mode. Loop keeps running so the warning repeats until fixed.
PAT cache (auth.AuthCacheTTL = 10m) doesn't need invalidation: the next
miss after the UPDATE re-reads the row and re-caches with the bumped
TTL automatically.
Co-authored-by: multica-agent <github@multica.ai>
* MUL-2744: fix(auth): renew PAT before first sync; CAS against renewal threshold
Addresses the two issues Elon raised on #3360.
Must-fix: if the PAT is already revoked/expired when the daemon starts,
syncWorkspacesFromAPI 401s and Run returns before the background
tokenRenewalLoop ever fires its initial renewal. The operator only sees
a generic auth failure in the workspace-sync log with no hint that
'multica login' is the fix. Now the startup path runs an inline
tryRenewToken first, surfacing the existing 401 WARN before anything
else gets a chance to fail. Pulled the renew + first-sync pair into
preflightAuth so the ordering invariant is enforced at one site and
tests can exercise the failure modes without spinning up the full Run
setup. Removed the redundant initial tryRenewToken from
tokenRenewalLoop — startup now owns the first call.
Nit: the previous WHERE clause on ExtendPersonalAccessTokenExpiry
(expires_at < $2) did not actually make concurrent renews idempotent
the way the comment claimed. Two callers race-computing
$2 = now + 90d produce strictly-different values, and the second
writer's $2 always exceeds the row the first writer just wrote, so the
UPDATE re-matches and bumps again. Switched to a CAS against the
renewal threshold (expires_at <= $renew_threshold_at, i.e. now + 7d):
once writer A pushes expires_at past the threshold, writer B's UPDATE
matches zero rows and the loser falls back to reporting the
already-extended value as a no-op.
Tests:
- TestPreflightAuth_RenewsBeforeWorkspaceSyncOnExpiredToken locks in
the call ordering — renew endpoint is hit before workspaces, and the
re-login WARN appears even though both endpoints 401.
- TestPreflightAuth_SyncProceedsWhenRenewIsNoOp covers steady-state
startup: a renew=false no-op must still progress to workspace sync.
- TestPreflightAuth_TransientRenewFailureDoesNotBlockStartup covers a
500 from the renew endpoint — startup must continue, no WARN.
- TestRenewPAT_ParallelRenewExtendsExactlyOnce fires N=8 concurrent
renews at one row and asserts exactly one returns renewed=true with
the others reporting the same already-extended expires_at, plus the
DB carries only that single bumped value.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
The "Agent will work in-place at …" banner used to render directly above
the comment input, which made it read like an input adornment. Move it
to the top of the Activity section (below the "Activity" heading, above
the live agent card) so it reads as section context instead of composer
chrome. Update the component's JSDoc and tweak the margin (mb-2 → mt-3)
to match the new placement.
MUL-2752
* fix(editor): fall back to literal paste when markdown parser drops all content
When pasting text like `<T>` or `<MyComponent>`, the CommonMark-compliant
markdown parser treats them as inline HTML tags. ProseMirror's schema doesn't
recognize unknown HTML elements, so they are silently dropped — producing an
empty document from non-empty input.
Detect this case (non-empty input → empty parse result) and fall back to
literal text insertion so the user sees their text instead of nothing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* fix(editor): escape non-standard HTML tags in paste to prevent content loss
When pasting mixed content containing multiple <tag> patterns (e.g.
"<t>\n裸 `<tag>` 做转\n<tag>\n<t>"), CommonMark treats bare <word>
as inline HTML. ProseMirror silently drops unknown HTML elements,
causing partial content loss. The previous empty-result fallback only
caught the single-tag case where the entire parse result was empty.
Pre-process paste text before markdown parsing: escape <tag> patterns
whose tag name is not a standard HTML element, while respecting inline
code spans and fenced code blocks. Standard HTML (div, br, img, etc.)
passes through normally.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* fix(editor): preserve raw html-like text on paste
* fix(editor): prefer rich html paste when semantic
* fix(editor): avoid native paste when html drops raw tags
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
lowlight.highlightAuto() returns a Root with zero children (relevance 0)
for content it cannot classify into any language. toHtml() of that tree
produces an empty string, so dangerouslySetInnerHTML rendered a blank
<code> element — the <pre> background was visible but the text was gone.
Fall through to plain text render when toHtml produces nothing.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
The current cursor-agent CLI no longer has a 'chat' subcommand. The
positional 'chat' argument was silently treated as prompt text, leaking
into the user message (e.g. 'chat <actual prompt>').
Remove 'chat' from buildCursorArgs so the generated argv matches the
current cursor-agent CLI interface.
Fixes#3077
* MUL-2748 docs(autopilots): document webhook event filters and link from UI
Follow-up to PR #3231. The webhook event filters feature shipped
without user-facing docs and the UI section gave no hint about how
event/action are derived from inbound requests.
- Add an "Event filters" subsection to autopilots.mdx and .zh.mdx
under the existing "Trigger from a webhook" section: what the
filter does (with the `event_filtered` outcome), where the event
name and action come from (body envelope / headers / body
fallbacks), examples, a non-string-action gotcha, and a curl
recipe verifying both the allowed and filtered paths.
- Add a small ExternalLink icon next to the "Event filters" label
in WebhookEventFilterSection that opens the docs section in a new
tab. Locale-aware: zh users land on the Chinese page anchor, en
users on the English one.
Co-authored-by: multica-agent <github@multica.ai>
* docs(autopilots): expand "where event name and action come from" with curl examples
Add concrete curl request/inference pairs under each derivation step
(body envelope, headers, body fallback, default) and a "common gotcha"
explaining why filter `event=trigger, action=true` does not match
`{"trigger": true}`. Mirrors the explanation that resolved the
on-call confusion about Event filter semantics.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* fix(avatar): normalize relative avatar urls in desktop/web
Co-authored-by: multica-agent <github@multica.ai>
* fix: test
Co-authored-by: multica-agent <github@multica.ai>
* fix(avatar): normalize avatar url in AvatarPicker preview
MUL-2746. The picker is used by create-agent and create-squad, and also
prefills from a template's `avatar_url` when duplicating an agent. The
upload result / template URL is root-relative in local-storage setups,
so on Desktop (file:// runtime) the preview <img> resolves against the
local filesystem and the avatar fails to render. Route the value through
`resolvePublicFileUrl` for rendering only; the stored URL stays raw so
the parent's create call still posts what the backend expects.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: J (Multica agent) <agents@multica.ai>
* feat(autopilots): webhook event filters per trigger (MUL-2334 follow-up)
Adds schema-backed event/action filtering to webhook triggers so operators
can declare exactly which GitHub (or generic) events should spawn autopilot
runs. Events outside the declared scope are recorded as ignored with reason
'event_filtered' — visible in the delivery log but without expensive run/task
creation.
Closes#3093 (supersedes the description-parsing approach from that PR).
Backend:
- Migration 108 adds event_filters JSONB to autopilot_trigger
- sqlc queries updated for CREATE / UPDATE / LIST / GET
- HandleAutopilotWebhook filters against trigger.event_filters before dispatch
- Create/Update trigger handlers accept event_filters in the request body
- Response shape includes event_filters so the UI can render it
Frontend:
- New WebhookEventFilterSection component in the autopilot dialog
- Inputs for event name + comma-separated actions
- i18n strings added (en + zh-Hans)
Tests:
- Unit tests for splitWebhookEvent and webhookEventAllowedByTriggerScope
- Handler-level integration tests for filtered / allowed / no-filter paths
co-authored-by: ZephaniaCN <agent/autopilot-webhook-filter>
* fix: recognize gitlab/bitbucket/gitea as providers in splitWebhookEvent
TestSplitWebhookEvent failed because only 'github' was recognized as a
provider prefix. Extract isKnownProvider() to handle gitlab, bitbucket,
and gitea as well.
* fix(autopilots): address PR #3231 review for webhook event filters
Must-fix from PR #3231 review:
1. event_filters now uses typed []WebhookEventFilter at the HTTP boundary
instead of []byte. encoding/json was base64-encoding the field on the
way out, so the UI could not .map() the response, and a real JSON
array on the way in failed to decode. Response field also decodes the
stored JSONB into a typed slice before serialising back.
2. UpdateAutopilotTriggerRequest.EventFilters is *[]WebhookEventFilter
with tri-state PATCH semantics: nil pointer = leave alone, [] =
clear, [...] = replace. The handler marshals an explicit empty slice
to the JSONB literal `[]` so COALESCE overwrites instead of preserves.
AutopilotDialog now PATCHes the webhook trigger when event_filters
change in edit mode (previously the toast said "updated" while the
backend was unchanged).
3. webhookEventAllowedByTriggerScope no longer short-circuits to false
on the first event-name match whose actions don't line up. Earlier
code silently shadowed any later filter that shared the same event
name with disjoint actions.
Robustness: validateWebhookEventFilters rejects empty event names /
actions at write time, and the matcher fails closed on malformed stored
bytes instead of widening the allowlist.
Tests: handler tests now post real JSON arrays (the prior []byte path
masked the contract bug). Adds round-trip / clear-with-[] / preserve-
when-omitted / replace / invalid-filter / filters-on-schedule coverage,
plus matcher tests for same-event multi-filter and malformed-deny.
Migration renamed 108 → 110 to avoid colliding with main's
108_task_token (came in via the merge from main).
Follow-up nits from PR #3324 review:
- Export DefaultAutopilotTriggerTimezone so the autopilot scheduler reuses
the same source-of-truth as the service layer instead of hardcoding "UTC"
in two places.
- Add tests that lock down the invalid-timezone fallback (e.g. "Foo/Bar")
for both buildIssueDescription and interpolateTemplate, so a future change
to the resolve/format helpers can't silently emit a half-formatted
timestamp or date.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* feat(auth): support mcn_ Cloud Node PATs verified via Fleet
Adds a new token kind, mcn_ (multica cloud node), recognized in both
the regular Auth and DaemonAuth middlewares. mcn_ tokens are minted
and owned by Multica Cloud (not the local personal_access_tokens
table); the server validates them by POSTing to the Fleet's
/api/v1/pat/verify endpoint and uses the returned owner_id as
X-User-ID for downstream handlers.
Cloud is the authoritative owner of token status, so this is a
verifier-only path with no DB fallback:
* Fleet says valid:false -> 401 (token genuinely bad)
* Fleet unreachable / 5xx -> 503 (transient, retry)
* No MULTICA_CLOUD_FLEET_URL configured -> 401 (fail closed)
Verification results are cached in Redis for 60s under
mul:auth:mcn:<sha256> to bound the per-request load on Fleet without
extending the revocation window beyond what the Cloud doc allows.
Negative results are NOT cached, so a freshly minted token doesn't
get locked out by a stale 'token_not_found'.
Reuses MULTICA_CLOUD_FLEET_URL (the same env the cloud-runtime proxy
already uses) so deployments don't need a second config knob.
Tests cover the happy path, every documented invalid reason, 4xx/5xx
mapping, network error, decode error, ctx cancellation, the
fail-closed valid:true-without-owner_id case, trailing-slash URL
normalization, and the Redis cache short-circuit + negative
no-cache contract. Middleware tests pin the four 401/503/200 outcomes
in both Auth and DaemonAuth.
* auth(mcn): require owner_id to map to a real local user; drop X-User-PAT plumbing
Two related changes:
1. Cloud-verified owner_id is now checked against our local users table.
The Cloud owner_id and our users.id share the same UUID space by
contract; a missing local user means either the row was deleted
under an active node or something is forging owner_ids — either
way, fail closed.
CloudPATVerifier.Verify takes a new OwnerLookupFunc:
- returns (true, nil) -> success, cache + return
- returns (false, nil) -> ErrCloudPATInvalid (reason='owner_unknown'),
NOT cached (so a freshly-created user
doesn't get locked out for a TTL window)
- returns (_, error) -> ErrCloudPATUnavailable (transient,
middleware emits 503)
Both Auth and DaemonAuth wire ownerLookupFor(queries), a new shared
helper that wraps queries.GetUser, mapping pgx.ErrNoRows / unparseable
UUIDs to (false, nil) and other errors to a real Go error.
2. Removed all X-User-PAT plumbing. Cloud now mints node-scoped mcn_
PATs itself during /api/v1/nodes (see multica-cloud
docs/api/node-pat.md) and ships them into the EC2 instance via SSM,
so multica-api no longer needs to forward the caller's mul_ PAT.
Propagating a long-lived user PAT into a remote machine widened
the blast radius of any node compromise; that's gone now.
Removed:
- cloud_runtime.go: withUserPAT option, cloudRuntimeUserPAT,
generateCloudRuntimePAT, revokeGeneratedPAT
- cloudruntime/Request.UserPAT field + X-User-PAT header
- X-User-PAT from CORS allowed headers
- obsolete handler tests:
TestCreateCloudRuntimeNodeForwardsValidatedPAT
TestCreateCloudRuntimeNodeRejectsUnownedPAT
TestCreateCloudRuntimeNodeRejectsExpiredPAT
TestCreateCloudRuntimeNodeAutoGeneratesPAT
replaced with TestCreateCloudRuntimeNodeForwardsBody
- X-User-PAT references in packages/core/api/client.test.ts
Tests:
* 3 new verifier-level tests (owner_unknown not cached, lookup error
-> Unavailable, success path is cached for both fleet AND lookup)
* 5 new owner_lookup_test.go tests (nil queries, existing user,
missing user, malformed UUID, DB error)
* 1 new end-to-end DaemonAuth test (cloud says valid, no local user
-> 401)
* Existing X-User-PAT TS assertions removed; full vitest run passes.
* go test ./... and go vet ./... clean on the server module.
* docs(project-resources): document local_directory resource type
Add user-facing guidance for the local_directory project resource introduced
in #3283 — when to pick it over github_repo, the Desktop / CLI attach flow,
path validation rules, the daemon-scoped one-per-(project, daemon) limit,
serial task execution + waiting_local_directory status, what the daemon will
and won't touch in the user's folder, and the v1 limits to call out
(no auto branch switch / commit / PR; dirty tree carried through).
Also ship the missing Chinese counterpart of project-resources and wire it
into meta.zh.json.
MUL-2618
Co-authored-by: multica-agent <github@multica.ai>
* docs(project-resources): cover write-conflict trade-off + mixed-resource behavior
Expand the local_directory docs in response to review feedback:
- Restate "when to pick local_directory" as two distinct use cases (clone
cost; fine-grained changes needing frequent local review) instead of a
one-liner, and make the trade-off explicit: v1 ships no file-level write
lock, so the per-directory serial gate is the only protection against
cross-issue agents touching the same files.
- Add a new "Mixing resource types, and multiple local_directory resources"
section that answers: github_repo + local_directory on the same project
(local takes precedence on the bound daemon, github_repo falls back
everywhere else), and two local_directory resources (only possible
across two daemons, routed by the agent's runtime assignment, no
load-balancing).
Mirrored into the Chinese translation. typecheck + tests still pass.
Co-authored-by: multica-agent <github@multica.ai>
* docs(project-resources): tighten local_directory wording per review
- Soften "only the bound daemon can take tasks" to "only the bound
daemon uses this local directory" (zh) — aligns with the
mixed-resource fallback section where other daemons still run.
- Clarify that local_directory does not create/use a github_repo
worktree for that task (en + zh); the per-workspace repo cache may
still sync as a background behaviour.
- Match implementation for the Desktop "Add local directory" button:
it stays visible but is disabled with a hint when daemon is offline
or the per-daemon limit is reached; only the web app hides it
outright (en + zh).
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* feat(comments): align UpdateComment post-processing with CreateComment (#2965 follow-up)
Part 1 — PR #2965 code review follow-ups:
- Fix sqlc Column3 naming → AttachmentIds via sqlc.arg(attachment_ids)
- Return 500 on ReplaceCommentAttachments failure instead of logging + 200
- Remove optional marker from onEdit attachmentIds (always passed)
- Add optimistic update for attachments in useUpdateComment
- Extract useEditAttachmentState hook from CommentRow/CommentCardImpl
- Add integration tests for attachment replacement scenarios
Part 2 — Edit-comment logic alignment:
- Add ExpandIssueIdentifiers to UpdateComment (bare identifiers now expand)
- Add handleEditMentionDiff: diff old vs new agent/squad mentions on edit,
cancel tasks for removed mentions, enqueue tasks for added mentions,
cancel + re-trigger when content changes but mentions are unchanged
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* fix(sqlc): regenerate with v1.31.1 + add mention diff integration tests
Fixes sqlc version downgrade (v1.31.1 → v1.30.0) that was introduced
when the original PR was authored with a local v1.30.0 binary.
Regenerated all sqlc output with v1.31.1 to match main.
Adds integration tests for handleEditMentionDiff covering: edit adds
mention → task enqueued, edit removes mention → task cancelled, edit
changes content with same mentions → cancel + re-trigger.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* refactor(comments): simplify edit post-processing to cancel-all + re-trigger
Replace handleEditMentionDiff (120-line mention diff) with a simpler
model: when content changes, cancel all tasks triggered by this comment,
then re-run the same three trigger paths as CreateComment (assignee,
squad leader, mentions). Fixes gap where assignee/squad-leader tasks
were not cancelled or re-triggered on edit.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* refactor(comments): extract triggerTasksForComment to unify Create/Edit trigger paths
Create and Edit duplicated the same three trigger paths (assignee,
squad leader, mentioned agents). A fourth path would need changes
in two places. Extract into a shared function so the composition is:
Create: trigger() + unresolve()
Edit: cancel() + trigger()
Delete: cancel()
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* ci: split mobile lint/typecheck out of frontend job
Mobile lint (~38s) + typecheck (~13s) ran on every web/desktop PR even
though mobile has no vitest suite and main CLAUDE.md already promises a
parallel mobile-verify workflow. Excluding @multica/mobile from the
frontend turbo filter pulls those 50s off the critical path, and the new
mobile-verify.yml runs them in parallel only when apps/mobile/** or
packages/core/types/** changes.
MUL-2729
Co-authored-by: multica-agent <github@multica.ai>
* ci(mobile-verify): broaden path filter to cover real mobile deps
The initial filter only watched `apps/mobile/**` and
`packages/core/types/**`, but mobile imports runtime modules from many
more `@multica/core/*` paths (agents, markdown, permissions,
api/schemas, etc.). PRs that touched only those subtrees would skip
main CI (via `--filter='!@multica/mobile'`) AND skip Mobile Verify — a
coverage regression vs. the pre-split CI.
Expand paths to:
- `packages/core/**` (covers every importable subpath)
- root install/turbo configs that affect mobile build:
`package.json`, `pnpm-lock.yaml`, `pnpm-workspace.yaml`, `turbo.json`
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* fix(create-issue): preserve parent_issue_id through Create with agent flow (MUL-2534)
When the create-issue modal was opened from the "Add sub issue" entry on
an existing issue and the user switched to "Create with agent", the
parent_issue_id was silently dropped: switchToAgent only forwarded
prompt + actor + project_id, the AgentCreatePanel had no notion of
parent context, and the daemon prompt never instructed the agent to
pass --parent <uuid>. The sub-issue intent was lost and the new issue
landed as a standalone.
This fix threads parent_issue_id through the whole pipeline silently —
no new editable form field, the existing carry channel handles it:
- Frontend: ManualCreatePanel.switchToAgent + AgentCreatePanel.switchToManual
now carry parent_issue_id (and identifier, for display) so the sub-issue
intent survives mode flips in either direction. AgentCreatePanel reads
parent from `data`, forwards to api.quickCreateIssue, and renders a
read-only "Sub-issue of MUL-XX" chip so the user can see the relationship.
- API: quickCreateIssue accepts optional parent_issue_id.
- Backend: QuickCreateIssueRequest validates parent_issue_id belongs to the
same workspace (same path as CreateIssue), persists it in
QuickCreateContext, and the daemon claim handler resolves the parent's
identifier for prompt context.
- Daemon prompt: when ParentIssueID is set, buildQuickCreatePrompt instructs
the agent to pass `--parent <uuid>` and treat the modal entry point as
authoritative.
Tests cover all three hops: switchToAgent carry payload, AgentCreatePanel →
api.quickCreateIssue, and the daemon prompt's --parent injection (with both
identifier-present and UUID-only fallback branches).
Co-authored-by: multica-agent <github@multica.ai>
* test(create-issue): cover quick-create parent trust boundary + identifier fallback (MUL-2534)
Address review on PR #3083:
- Add server-side test for POST /api/issues/quick-create parent_issue_id:
same-workspace parent threads through QuickCreateContext.ParentIssueID,
foreign-workspace and bogus UUIDs return 400 and never enqueue a task.
- Fall back to `data.parent_issue_identifier` in ManualCreatePanel's
switchToAgent when the parent detail query hasn't hydrated yet, so the
agent chip never renders "Sub-issue of " with an empty tail.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
* feat(project): add local_directory project_resource type (MUL-2662)
Adds a second project_resource type alongside github_repo so a project
can be pinned to an existing directory on a specific daemon (the v1 of
the local-working-directory flow tracked in MUL-2618). The ref schema is
{ local_path, daemon_id, label? }; local_path must be absolute and
daemon_id is required. The same (daemon_id, local_path) pair is allowed
on multiple projects by design — no UNIQUE constraint is added.
Implementation reuses the existing project_resource API surface: the new
type is wired through the validator switch with no migration, no new
events, and no daemon-handler changes (daemon already passes through
arbitrary resource types via ProjectResources). The CLI gains
--local-path / --daemon-id / --ref-label shortcuts so
`multica project resource add --type local_directory` mirrors the
existing `--type github_repo --url ...` ergonomics; the generic --ref
flag still works for both types.
Tests cover the full CRUD lifecycle, the same-path-across-projects
allowance, the same-path-same-project conflict, the validator rejections
(missing/blank/relative path, missing daemon_id, wrong payload type),
and the cross-platform isAbsoluteLocalPath helper.
Co-authored-by: multica-agent <github@multica.ai>
* feat(project): add update endpoint + label-shadow guard for project_resource (MUL-2662)
Addresses the Elon review on PR #3263:
- Add PUT /api/projects/{id}/resources/{resourceId} with sqlc query,
matching handler, CLI `project resource update`, and a new
EventProjectResourceUpdated WS event. resource_type stays immutable;
ref/label/position are all individually optional.
- Catch same-project (daemon_id, local_path) collisions where only the
embedded label differs — the row-level UNIQUE only matches the full
ref JSON, so a label typo would otherwise let the same working
directory bind twice.
- Tests cover the update lifecycle (label-only / ref / clear / 404 /
invalid path) and the label-shadow conflict on both create and
update; the in-place rename still succeeds because the conflict
scan ignores the row being edited.
Incidental: regenerating sqlc picked up a missing skills_local scan in
UpdateAgentCustomEnv that drifted in from #3200.
Co-authored-by: multica-agent <github@multica.ai>
* fix(project): close bundled-create label-shadow gap + merge resource_ref on CLI update (MUL-2662)
Two follow-ups from MUL-2662 review round 2:
- CreateProject inline resources path now dedupes local_directory entries on
(daemon_id, local_path) before opening the transaction. The DB-level
UNIQUE(project_id, resource_type, resource_ref) constraint only fires on a
full JSON match, so two rows with the same target but different `label`
would otherwise slip past. Standalone POST/PUT already cover this via
findLocalDirectoryConflict; bundled create was the missing surface.
- `multica project resource update` now seeds resource_ref from the existing
row before applying per-type shortcut flags, so `--default-branch-hint x`
on its own no longer constructs a payload missing `url` (which the server
400s on). Local_directory partial edits get the same merge behavior.
Co-authored-by: multica-agent <github@multica.ai>
* feat(desktop): local_directory project_resource UI (MUL-2665) (#3273)
* feat(desktop): local_directory project_resource UI (MUL-2665)
First UI surface for the local-working-directory flow tracked in MUL-2618.
Lets users on the desktop pin a project to an existing folder on this
machine; web stays read-only since the per-daemon check can't be done in
the browser.
What's new for the renderer:
- ProjectResourcesSection grows a desktop-only "Add local directory"
button next to the existing GitHub-repo popover. Clicking it opens
Electron's native folder picker, validates the path through a new
IPC pair (existence + r/w), and submits a project_resource of
resource_type=local_directory with daemon_id pulled live from
daemonAPI.getStatus.
- LocalDirectoryRow renders the rename pencil + path tooltip, and
greys out when ref.daemon_id != this machine's daemon_id (with a
"only available on the machine that registered this directory"
tooltip). Delete stays enabled so users can drop stale registrations
from any device.
- LocalDirectoryHint sits above the issue-detail comment composer and
shows "Agent will work in-place at {label} ({path})" when the issue's
project has a local_directory matching this daemon. Hidden on web.
- TaskStatusPill picks up a new "waiting_for_directory_release" stage
that the daemon will publish when it dequeues a task but can't
acquire the path lock. The render is in place now so the daemon
sibling subtask can wire the status string without an additional UI
PR.
Plumbing:
- @multica/core/types gains LocalDirectoryResourceRef +
UpdateProjectResourceRequest, and the api client gets the matching
PUT method backed by the server endpoint that landed in
2ac3faebb (MUL-2662). A useUpdateProjectResource hook drives the
in-place label edit.
- New Electron handlers under apps/desktop/src/main/local-directory.ts:
local-directory:pick -> dialog.showOpenDialog (openDirectory)
local-directory:validate -> stat + access(R_OK + W_OK)
exposed through the preload as desktopAPI.pickDirectory /
validateLocalDirectory. View code talks to them via a thin
packages/views/platform helper that returns reason=unsupported on
web instead of crashing.
- useLocalDaemonStatus exposes the local daemon's id, device name, and
running flag from daemonAPI.onStatusChange so the renderer can do the
cross-device match without coupling to the desktop preload typings.
Tests:
- pickStageKeys gets a unit test covering the new stage and proving
the directory-release status outranks availability hints.
- LocalDirectoryHint tests cover the four render branches (no project,
no daemon, foreign daemon, matching daemon).
- i18n parity stays green; new keys added under projects.resources.*
and chat.status_pill.stages.waiting_for_directory_release in both
locales.
Out of scope (will land separately):
- The daemon-side waiting/lock signal that flips the pill into the
new state.
- Adding local_directory to the create-project modal's bulk
attach flow.
- Docs page refresh for project-resources.mdx — left for the
MUL-2618 umbrella sweep.
Co-authored-by: multica-agent <github@multica.ai>
* fix(desktop): hide rename for foreign daemon local_directory rows (MUL-2618)
Address review nit on #3273: the rename pencil was gated only by
`canEdit`, so a foreign / unknown-daemon row still showed it even
though the spec says cross-device rows are disabled. Gate rename on
`!mismatch` so it disappears on those rows; delete stays available
so a stale registration can still be dropped from any device.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
* feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663) (#3274)
* feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663)
Wires up the daemon side of the local_directory project_resource introduced
in MUL-2662. When a task is dispatched against a project whose resources
include a local_directory pinned to this daemon's UUID, the daemon now:
- Validates the path (absolute, exists, daemon process can read+write,
not in the system-root / $HOME blacklist) and fails the task fast on
any precondition violation, with a user-readable reason.
- Serialises concurrent tasks on the same on-disk path via a
daemon-local LocalPathLocker keyed by symlink-resolved realpath. The
lock is held for the entire task lifetime (claim → context write →
agent → result report).
- When the lock is contended, the daemon flips the row to a new
waiting_local_directory status on the server (carrying a wait_reason
like "<path> (held by task <short id>)") so the UI can render
"等待本地目录释放" instead of leaving the row silently in dispatched
past the sweeper timeout. The status accepts being woken into running
once the lock is acquired.
- Sets execenv.WorkDir to the user's path (no copy, no mount). envRoot
still lives under workspacesRoot/<wsID>/ and hosts output/, logs/, and
.gc_meta.json — the daemon's logbook for the run.
- Stamps GCMeta.LocalDirectory=true so the GC loop never RemoveAlls
envRoot for these tasks (gcActionClean → gcActionCleanArtifacts,
gcActionOrphan → gcActionSkip). The user's directory was never under
envRoot to begin with, so this is defense in depth.
- Skips execenv.Reuse for local_directory tasks because the prior
WorkDir is the user's path and reusing it through that code path
loses the envRoot association the GC loop needs. Prepare is cheap
here (no clone, no copy), so always running it is fine.
Server-side protocol changes:
- New CHECK value 'waiting_local_directory' on agent_task_queue.status
plus a wait_reason TEXT column (migration 109).
- All cancel / active / counted-as-running / orphan-recovery queries
expanded to include the new status; FailStaleTasks intentionally
excludes it (the daemon owns the wait).
- New SQL MarkAgentTaskWaitingLocalDirectory(id, reason) and a relaxed
StartAgentTask that accepts both dispatched and
waiting_local_directory as preconditions (and clears wait_reason on
the way through).
- New POST /api/daemon/tasks/{taskId}/wait-local-directory endpoint,
TaskService.MarkTaskWaitingLocalDirectory broadcaster, and matching
daemon Client.MarkTaskWaitingLocalDirectory.
Tests cover: path blacklist + R/W enforcement, mutex serialisation +
ctx-cancelled wait, lock handover between two tasks, GC never returns
gcActionClean / gcActionOrphan for local_directory rows (with negative
control for the standard path), and Prepare/Cleanup correctly substitute
+ protect the user's WorkDir.
The desktop UI side (UI for adding a local_directory resource, surfacing
the "等待本地目录" badge) is MUL-2665; the agent-task lifecycle changes
(no branch switch, dirty-tree tolerant, auto-commit) are MUL-2664.
This PR targets the shared MUL-2618 v1 feature branch agent/j/912b8cb1,
not main; the whole v1 will be merged to main together when complete.
Co-authored-by: multica-agent <github@multica.ai>
* fix(daemon): tighten local_directory status, symlink, cancel handling (MUL-2618)
Address the 3 must-fix items from Elon's review of PR #3274.
1. Status string unified. The server / daemon publish
`waiting_local_directory`; align views, locales, and the
pickStageKeys test (PR #3273 had used `waiting_for_directory_release`
on a placeholder string). Without this, the daemon's wait state
never reached the pill once the two siblings merged.
2. validateLocalPath now also runs the blacklist against the
symlink-resolved realpath, with macOS's `/etc` -> `/private/etc`
redirect handled via `isBlacklistedRealPath` which compares
canonical forms. Without this, a symlink such as
`/Users/me/proj/home -> /Users/me` slipped the literal $HOME check
while every daemon write still landed in the user's home. Tests
cover symlink-to-home, symlink-to-system-root, and the negative
case (symlink to a regular subdirectory).
3. acquireLocalDirectoryLockIfNeeded now spins up a cancellation
watcher inside `onWait` (lazy — the fast path stays free) so the
gap between dispatch and StartTask responds to server-side cancel
or row deletion. If the watcher fires while the daemon is parked
on the path mutex, the lock-wait context is cancelled, Acquire
returns promptly, and the helper exits silently the same way the
run-phase poller does. New TestAcquireLocalDirectoryLock_CancelDuringWait
exercises the path end-to-end with a fake server.
Co-authored-by: multica-agent <github@multica.ai>
* fix(daemon): unconditional canonical blacklist + Windows drive-root generalisation (MUL-2618)
- validateLocalPath now always runs isBlacklistedRealPath on the
symlink-resolved path, not only when it differs from absPath. The old
guard let users type the canonical form of an OS-symlinked banned root
(e.g. /private/tmp, /private/etc, /private/var on macOS) straight
through, since EvalSymlinks is a no-op on already-canonical input.
- Windows drive-root rejection moved off the static C/D/E/F enumeration
onto filepath.VolumeName via a new isDriveRoot helper, so removable /
network drives mounted at G:..Z: and UNC \\server\share roots are also
blocked. systemRootBlacklist keeps the well-known C:\ trees only.
- Tests: macOS-only case exercises direct /private/{tmp,etc,var}; a
new TestIsDriveRoot covers the Windows generalisation (skipped on
POSIX runners by runtime guard).
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
* feat(views): wire waiting_local_directory end-to-end in issue UI + presence (MUL-2618)
Connect the daemon-emitted `task:waiting_local_directory` and `task:running`
events through to issue execution log, sticky agent banner, activity indicator,
and agent presence so a parked task is no longer invisible on the issue page.
- Add `waiting_local_directory` to `AgentTask.status` and the typed
`task:running` / `task:waiting_local_directory` WS event payloads.
- Chat realtime sync writes both new statuses into the pending-task cache so
the chat StatusPill flips out of a stale `dispatched` frame.
- ExecutionLogSection: count `waiting_local_directory` as active, add tone +
status label, treat parked tasks the same as dispatched for time anchor /
transcript visibility / terminate-confirm note.
- AgentLiveCard: subscribe to both new events, rank the parked state between
dispatched and queued, and surface a "is waiting for the local directory"
banner with the muted "Clock" treatment used for queued.
- IssueAgentActivityIndicator: route parked tasks into the queued bucket so
the hover stack and chip stay visible.
- derive-presence: parked tasks count toward `queuedCount` so the agent
workload chip stays out of `idle` while the daemon waits on the path lock.
- Locales: add `agent_live.is_waiting_local_directory` and
`execution_log.status_waiting_local_directory` (en + zh-Hans).
Co-authored-by: multica-agent <github@multica.ai>
* feat(project): enforce one local_directory per (project, daemon) (MUL-2618)
The daemon-side resolver picks the first matching local_directory by
daemon_id, so allowing two rows on the same daemon — even at different
paths — let the agent silently write into whichever sorted first. Tighten
the invariant top to bottom:
- server: `findLocalDirectoryConflict` rejects any second row sharing a
daemon_id, regardless of `local_path` or label. Bundled-create surface in
`CreateProject` runs the same daemon-scoped dedupe up front.
- daemon: `findLocalDirectoryAssignment` fails fast when it finds more than
one row pinned to the current daemon (older API client / direct DB
writes can still produce that state — refuse to guess).
- desktop UI: hide the "Add local directory" action once the current
daemon owns a row on this project, with a hint and a defensive toast on
the call path; foreign-daemon rows stay visible read-only as before.
- Tests:
* daemon: new `two local_directory rows on this daemon fail fast` /
`local_directory rows on different daemons coexist` cases.
* handler: rewrite the legacy `LabelShadow` cases as
`DaemonScopedConflict` / `BundledLocalDirectoryDaemonConflict` —
asserts 409 on same-daemon different-path, 201 on per-daemon bundles.
- Locales: en + zh-Hans copy for the new hint + toast.
Co-authored-by: multica-agent <github@multica.ai>
* chore(sqlc): drop stale skills_local in UpdateAgentCustomEnv (MUL-2618)
Follow-up to the main-merge in 0f8e8ca7: the auto-merge preserved most
of main's skills_local revert but kept the column reference inside the
UpdateAgentCustomEnv scanner because that block hadn't been touched by
either side. Re-running `sqlc generate` regenerates the file without
skills_local in this query, matching the rest of the file and the
post-revert schema.
Co-authored-by: multica-agent <github@multica.ai>
* feat(create-project): binary source picker — repos OR local directory
Turn the create-project dialog's "Repos" pill into a binary Source
picker. A project's source is mutually exclusive: either a set of
GitHub repos (worktree mode, default) or a single local working
directory (local mode, desktop-only). Mirrors the constraint the
backend will enforce next.
Behavior:
- Pill shows the active mode's selection (GitHub icon + repo count, or
folder icon + local label/path).
- Popover has a 2-tab segmented control at the top; the Local tab is
hidden entirely on web (local_directory needs a daemon_id).
- Local tab requires the daemon online — amber notice + disabled picker
when offline, re-renders automatically via useLocalDaemonStatus.
- Switching tabs preserves the other side's stash, but handleSubmit
only emits the resource matching the active sourceMode, so abandoned
picks never leak into the created project.
Backend mutual-exclusion validation + the resources-section
conditional-add-button still to come — this PR just unblocks the
dialog so it can be demoed.
* fix(mobile): cover waiting_local_directory in run row status maps (MUL-2618)
---------
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Multica J <j@multica.ai>
clearContent() and setIsEmpty() were called before await onSubmit(),
causing permanent content and draft loss on network failure. Move both
to the success path, consistent with attachment and draft cleanup.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
The bare "N" under the Repositories shortcut had no label and was
adjacent to a sentence ("Repository URLs live in the Repositories
tab") that has no semantic link to a number, so users read it as a
typo. The card is a navigation shortcut, not a status panel — the
actual count is visible after clicking through.
MUL-2725
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
This was the only `list` subcommand that printed a human-readable count
to stderr. Consumers that merge stdout/stderr (agent harnesses, CI
`2>&1`) saw it interleaved with the JSON array on `--output json`, and
in table mode it carried no information the table itself didn't.
The `Next thread cursor` / `Next reply cursor` lines stay — they're
real paging signals the agent runtime reads from stderr.
Closes#3303
MUL-2709
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* fix(comments): clear editor immediately on submit to eliminate WS race visual glitch
The comment editor stayed populated while WebSocket delivered the new
comment faster than the HTTP response, causing a "duplicate comment"
flash. Move clearContent/setIsEmpty before the await so the editor clears
at click time. Also remove dead `submitting` state in useIssueTimeline
(redundant with the input components' own guards) and dead `isTemp` logic
in comment-card (no code path ever creates temp- prefixed entries).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* fix(comments): preserve attachments on submit failure and fix CommentRow indentation
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
The "Board ordered by" overlay used absolute positioning inside a
scrollable container, causing it to drift with scroll content. Move
the overlay outside the scroll area into a non-scrolling wrapper so
it stays centered in the visible viewport.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
- Normalize nav/page/section titles to plural English (Issues/Skills/Tasks) per conventions.zh.mdx rules for section titles
- Lowercase 'Issue' inside UI short phrase '我的 Issue' (UI short-phrase rule)
- Translate concept words in GitHub settings (Connection/Features/Repositories/Done)
- Translate 'Cloud Runtime' to '云端运行时' to match runtime→运行时 glossary
Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* feat(views): swimlane supports parent / project / assignee grouping (MUL-2711)
The swimlane view was hard-coded to group by parent issue. This adds a
display dropdown so users can pick parent (default), project, or
assignee — analogous to how the board view exposes its grouping option.
- Generalise the lane builder in swimlane-view.tsx behind a `LaneGroup`
abstraction (matcher + per-grouping `moveUpdates` payload) so the
drag-end handler no longer branches on grouping. Cell ids gain a
`<grouping>:<rawId>` prefix and lane sortable ids include the
grouping so dnd-kit cannot collide entries from different groupings.
- Extend the view store with `swimlaneGrouping`, `swimlaneOrders` (one
saved order per grouping), and a grouping-keyed `collapsedSwimlanes`.
The persist `merge` defends against the old `string[]` shape so a
pre-upgrade snapshot doesn't crash on first read.
- Wire `setSwimlaneGrouping` into the issues display popover next to
the existing board grouping control. Add en / zh-Hans copy for the
three swimlane buckets (Parent issue / Project / Assignee) and the
two new pinned lanes (No project / Unassigned).
- Expand swimlane tests with parent / project / assignee smoke cases
and update existing mocks to the new lane-id format. Add stable
`useActorName` / `projectListOptions` mocks to avoid the
set-state-in-effect loop that an unstable `getActorName` would
trigger via the cells-rebuild memo.
Co-authored-by: multica-agent <github@multica.ai>
* feat(views): default swimlane grouping to assignee
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
The MEMBERS column was hardcoded to "-" in the table output, so every
squad looked empty even though the backend already returns
`member_count` (and `member_preview`) on each row. `squad get --output
json` exposed the correct data, which is why the bug was cosmetic but
confusing.
Read `member_count` from the response and render it; fall back to "-"
when missing or zero so empty squads stay visually distinct.
Fixes#3304 (MUL-2706).
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
Closes#3300.
After #2359 added canAccessPrivateAgent to chat, @mention, ListAgents,
GetAgent, history, edit, delete and issue assignment, one trigger path
was missed: shouldEnqueueOnComment. Once an owner/admin assigned a
private agent to an issue, the agent's UUID was "welded" onto that
issue and any workspace member who could view the issue could dispatch
a new task to it by posting a plain (non-@mention) comment — bypassing
the visibility gate the #2359 work was supposed to enforce.
Mirror the @mention path: plumb (authorType, authorID) from
CreateComment into shouldEnqueueOnComment, load the assigned agent, and
gate it with canAccessPrivateAgent before enqueueing. Add a Go
regression test on the existing privateAgentTestFixture covering the
plain-member, agent-owner, workspace-owner and agent-to-agent cases.
Co-authored-by: multica-agent <github@multica.ai>
#3265 already removed this blue "Importing creates a workspace copy..."
banner, but #3286 (the skills_local toggle revert) brought it back as
collateral. Re-remove it — this tab isn't where skill imports happen
(that lives behind Skills page → Add Skill → From Runtime), so the
callout is pure noise here.
Also flip the header row back to items-center now that the intro is
once again the only thing in it.
Fix Hermes ACP usage attribution to current model when agent.model is unset.
Also preserves cache-read token accounting and makes ACP model-list parsing more tolerant of snake_case payloads and Unknown display names.
Pin @xmldom/xmldom to ^0.8.13 in `pnpm.overrides` so every transitive
resolution (currently @expo/plist@0.5.3 and plist@3.1.0, both pulled
through expo) ships a patched build. All four lockfile entries move
from 0.8.12 to 0.8.13.
Closes the four high-severity advisories pnpm audit reports against
the prior 0.8.12 resolution:
- GHSA-2v35-w6hq-6mfw — uncontrolled recursion in serialization (DoS)
- GHSA-f6ww-3ggp-fr8h — XML injection via DocumentType serialization
- GHSA-x6wf-f3px-wcqx — node injection via processing-instruction
- GHSA-j759-j44w-7fr8 — node injection via comment serialization
Using `pnpm.overrides` (not a root direct dep) keeps the transitive
fix scoped to the dependency graph and avoids implying that the
multica codebase consumes xmldom directly.
Verification: `pnpm audit --prod --audit-level high` no longer lists
any @xmldom/xmldom advisories on this branch.
Co-authored-by: multica-agent <github@multica.ai>
* fix(server): gate GitHub auto-close on closing keywords (MUL-2680)
Closesmultica-ai/multica#3264. The PR webhook previously treated any
mention of an issue identifier in a PR title/body/branch as a close
intent, so a body of "Closes MUL-1. Follow up in MUL-2. Unblocks MUL-3."
would advance all three issues to done on merge. The auto-link layer
stays generous (mentions still link the PR), but advancing to done now
requires an explicit "Closes/Fixes/Resolves MUL-X" keyword adjacent to
the identifier in the title or body — bare title prefixes (`MUL-1: ...`)
and branch-name references no longer auto-complete.
MUL-2680
Co-authored-by: multica-agent <github@multica.ai>
* fix(server): persist close_intent on issue↔PR link rows (MUL-2680)
The first take of MUL-2680 gated auto-advance on `closingIdents[id]` from
the current webhook event. That broke the multi-PR sibling case: a PR
declaring `Closes MUL-X` could merge first while a link-only sibling
stayed open, leaving the issue in_progress; when the sibling closed
later, its webhook carried no closing keyword and the handler skipped
re-evaluation, so the issue stayed stuck forever.
Move close intent from per-event state to per-link state:
- New `close_intent` column on `issue_pull_request` (migration 109),
set monotonically — `LinkIssueToPullRequest` ORs the existing flag with
the incoming one so a subsequent webhook re-fire without the keyword
cannot clear it.
- New `GetIssuePullRequestCloseAggregate` query returns open-count and
merged-with-close-intent-count for an issue. The auto-advance gate
now reads from this persisted aggregate, which is event-agnostic: any
terminal linked-PR event re-evaluates and the verdict only depends on
accumulated DB state.
- Webhook handler links all mentioned identifiers first (writing
close_intent for the ones declared with a keyword), then iterates the
affected issues in a separate pass to re-evaluate. The 'only fires for
keyword-declared identifiers in this event' gate is gone — replaced by
`merged_with_close_intent_count > 0` against the link rows.
Regression test `TestWebhook_LinkOnlySiblingMergeAfterCloseKeywordPR`
walks the full open→merge→open→merge sequence Elon described and asserts
the issue advances on the link-only sibling's merge.
MUL-2680
Co-authored-by: multica-agent <github@multica.ai>
* Fix GitHub close intent updates
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Eve <eve@multica-ai.local>
The recent server-side-sort change (#3228) keyed the issue-list cache by
sort but did not update the load-more hooks: useLoadMoreByStatus used a
prefix-match that could pick a stale cache variant, and neither hook
forwarded sort to the API request. As a result, scroll-to-load-more
fired its request, but the response was either appended to a cache no
useQuery was subscribed to, or it appended rows in an unsorted order
into a sorted bucket.
Pass `sort` explicitly through Board/List/Swimlane and into the hooks.
The hook now targets the full sorted key via setQueryData and forwards
sort to the listIssues / listGroupedIssues calls so the appended page
lines up with the existing items.
Also adds focused tests for both load-more hooks: stale-sort cache is
untouched, sort is forwarded to the API, and sort-less callers still hit
the {} key path used by actor-issues-panel.
Co-authored-by: multica-agent <github@multica.ai>
* feat(agents): hide skills_local toggle for runtimes that don't honour it (MUL-2603)
Only Claude Code and Codex runtimes actually enforce `skills_local` at exec
time today — Claude isolates `~/.claude/skills/` via `CLAUDE_CONFIG_DIR`,
Codex isolates `~/.codex/skills/` via per-task `CODEX_HOME`. Every other
runtime currently stores the field but treats it as a no-op, which made
the toggle in the Create Agent dialog and Skills tab misleading for those
runtimes.
Gate the toggle on `runtime.provider` so it only renders for the providers
the daemon currently isolates. Centralise the supported-provider list as
`isSkillsLocalSupportedProvider()` in `packages/core/agents` and reuse it
from the create dialog and the Skills tab. The create dialog also drops
`skills_local` from the payload when the selected runtime is unsupported,
so a runtime swap can't leave a stale `ignore` opt-in pinned where it
would never take effect.
Docs (EN + ZH) updated to say the toggle is hidden — not just "a no-op" —
for the unsupported runtimes.
Co-authored-by: multica-agent <github@multica.ai>
* docs(agents): align skills_local hint and type comment with claude+codex boundary
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
The Usage / Runtime dashboards read from `task_usage_hourly`, but the
default self-host stack does not schedule `rollup_task_usage_hourly()`
anywhere — the bundled pgvector/pgvector:pg17 image ships without
pg_cron, and the backend does not run the rollup in-process. Fresh
installs see the dashboard stay at zero forever (#3244), and upgrades
from v0.3.4 → v0.3.5+ are blocked by migration 103's fail-closed guard
(#3015).
Document the three supported paths (external cron / systemd-timer /
CronJob, Postgres with pg_cron, or backfill_task_usage_hourly for
upgrades) across SELF_HOSTING.md, SELF_HOSTING_ADVANCED.md, the
quickstart pages on the docs site, and add troubleshooting entries
for both the silent-zero and the migration-guard failure modes.
Co-authored-by: multica-agent <github@multica.ai>
- Rename printDaemonStatusTable -> printDaemonStatusReport. The helper
emits a key/value list, not a table; the old name implied a tabular
layout that never existed and made the call site read wrong.
- Align the value column dynamically off the widest key. Previously the
spacing was hard-coded so the static rows (Version/Agents/Workspaces)
all landed at column 14, but the dynamic "Daemon [profile]" label
could outgrow that and push only its own value rightward, breaking
vertical alignment as soon as a profile was active.
- Add negative coverage for cli_version absent / empty (the real
back-compat contract for older daemons paired with a newer CLI) and a
test that asserts the value column lines up under a long profile
label.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agent): surface host OAuth token via env var on macOS isolation (MUL-2603)
Claude Code 2.x scopes the macOS keychain credentials entry by
sha256(CLAUDE_CONFIG_DIR)[:8], so the MUL-2603 isolation path strands
the child at "Not logged in" even after #3261 mirrored .claude.json:
the child looks up `Claude Code-credentials-<scratch-hash>`, the host
token is sitting in the no-suffix `Claude Code-credentials` entry.
Read the host OAuth token from the keychain via /usr/bin/security and
inject it as CLAUDE_CODE_OAUTH_TOKEN, which bypasses keychain lookup
entirely. Linux/Windows continue to use the .credentials.json mirror
(no-op there). Operator-pinned tokens and ANTHROPIC_API_KEY both take
precedence over the keychain reader.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agent): tighten empty-value auth gate, pin Claude CLI env-scrub assumption (MUL-2603)
Empty-value gate
- `ANTHROPIC_API_KEY=` inherited from a login shell that conditionally
exports auth previously posed as an "operator pinned API-key auth"
choice and disabled the keychain reader, stranding the isolated child
at "Not logged in" even though no auth was actually selected.
- Custom_env `CLAUDE_CODE_OAUTH_TOKEN=""` (stale agent config) had the
same effect, plus would have shadowed a keychain-injected token in
libc env lookups that pick the first match.
- Both are now treated as noise: the empty entry is dropped from the
child env and the keychain reader runs unchanged. Two new unit tests
cover the os.Environ side (`...TreatsEmptyAnthropicAPIKeyAsUnpinned`,
`...HonorsNonEmptyAnthropicAPIKey`) and the custom_env side
(`...EmptyOAuthTokenInCustomEnvAsUnpinned`).
Env-scrub boundary
- Surfacing `CLAUDE_CODE_OAUTH_TOKEN` to the isolated child is only
safe because Claude Code itself drops that variable from the env it
hands to Bash / hook subprocesses, so a model-driven `printenv` can
never echo the secret into the agent transcript.
- Empirically verified against `claude` 2.1.121:
printf '...test -n "$CLAUDE_CODE_OAUTH_TOKEN" && echo SET || echo UNSET...' \
| CLAUDE_CODE_OAUTH_TOKEN=sk-canary-XYZ \
MUL2603_CONTROL=control-value \
claude --print --output-format text \
--allow-dangerously-skip-permissions --allowedTools Bash
returned `UNSET` for the OAuth token while the non-sensitive
`MUL2603_CONTROL` control returned `CONTROL-SET`, proving the CLI
scrubs only the auth env, not the env in general.
- Pinned this assumption in a new skip-gated regression test
(`TestClaudeCLIScrubsOAuthTokenFromBashSubprocess`) that boots the
real CLI with a canary token; failing the test means upstream
Claude Code stopped scrubbing and the passthrough must move off env
vars before MUL-2603 can ship.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agent): gate keychain passthrough on default host dir, harden scrub test (MUL-2603)
Two follow-ups from the round-2 review on #3267:
1. Custom CLAUDE_CONFIG_DIR no longer pulls the default OAuth token.
Claude Code 2.x maps each config dir to its own suffixed
`Claude Code-credentials-<hash>` keychain entry, so an operator that
pins a managed/custom CLAUDE_CONFIG_DIR via custom_env or the
daemon-host env was getting the *daemon user's* default unsuffixed
entry injected into the isolated child — silently crossing accounts,
exactly the boundary mirrorHostClaudeJSONIfMissing already protects
for `.claude.json`. buildClaudeEnvWith now threads the effective
hostConfigDir through and only calls the reader when that dir is the
default `$HOME/.claude`. The new gate has a unit-level truth table
(TestIsDefaultHostClaudeConfigDir) plus a regression
(TestBuildClaudeEnvIsolatedSkipsKeychainForCustomHostConfigDir) that
makes a t.Fatal-armed reader prove the gate keeps the read off for
custom dirs.
2. Scrub e2e now asserts the control prong and the proof-of-execution
marker, not just "canary absent". The previous assertion would
false-pass on a model refusal, paraphrase, or "Bash gets no env at
all" upstream change. The strengthened version sets a non-secret
MUL2603_CONTROL alongside the canary OAuth token and asserts (a)
canary is NOT in the transcript, (b) CONTROL-SET IS in the
transcript (env propagation works for non-secrets — proves a
targeted scrub), (c) UNSET IS in the transcript (the Bash tool
actually ran AND saw the OAuth var as empty/unset). Code comment in
buildClaudeEnvWith and the test docstring now narrow the
security contract to the Bash tool subprocess only; hook subprocess
env-scrub is no longer claimed because it has not been verified.
Co-authored-by: multica-agent <github@multica.ai>
* test(agent): use per-run nonces in Claude scrub e2e to kill false-pass (MUL-2603)
Elon's round-3 review flagged that TestClaudeCLIScrubsOAuthTokenFromBashSubprocess
still false-passed: the proof markers "UNSET" / "CONTROL-SET" were literal
strings in the prompt, so strings.Contains matched them even when the model
only paraphrased the prompt without spawning Bash.
Replace the hard-coded markers with two per-run random hex nonces passed *only*
via env vars (MUL2603_UNSET_NONCE, MUL2603_CONTROL_NONCE). The prompt now
references the variable names, not the values, so the nonces can land in the
transcript only if a real Bash subprocess inherits the env vars and echoes
them. A paraphrasing or refusing model cannot fake nonces it never saw.
Also update the security-boundary comment in buildClaudeEnvWith to describe
the nonce-based proof.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
* fix(server): trigger assignee on agent-driven backlog→active (MUL-2670)
The backlog→active transition was gated on `actorType == "member"`, which
silently dropped agent-driven promotions and broke the documented serial
sub-task workflow — a parent agent finishing Step 1 and promoting Step 2
from backlog→todo would never fire Step 2's assignee.
Replace the member-only gate with a self-promotion guard. Agent actors
now fire the same enqueue path as members; the only excluded case is an
agent promoting an issue assigned to itself (which would self-loop on
every run). Applied to both UpdateIssue and BatchUpdateIssues.
Adds two integration tests covering the documented serial-chain case and
the self-loop guard.
Co-authored-by: multica-agent <github@multica.ai>
* fix(server): scope backlog→active self-loop guard to the calling task's issue
The previous agent-id-only guard over-blocked same-agent serial chains:
if Agent A finished a task on issue I1 and promoted issue I2 from
backlog→todo, the promotion was silently dropped whenever I2 was also
assigned to A. Only the cross-agent handoff worked.
Replace the actor-vs-assignee check with a task-vs-issue check:
isAgentRunningOnIssue looks up the calling X-Task-ID and only blocks
when that task's issue_id matches the issue being promoted (the true
self-loop). Member actors and same-agent cross-issue promotions now
fire, including via BatchUpdateIssues.
Tests:
- TestBacklogToTodoByAgentSameIssueDoesNotSelfTrigger (true self-loop)
- TestBacklogToTodoByAgentSameAgentDifferentIssue (serial chain works)
- TestBatchBacklogToTodoByAgentTriggersAssignee (batch path)
- TestBacklogToTodoByAgentTriggersSquadLeader (squad branch)
Co-authored-by: multica-agent <github@multica.ai>
* test(server): seed running task in handler test helper to avoid collisions
createHandlerTestTaskForAgentOnIssue inserted with status='queued',
which broke two tests added by the same-issue self-loop guard:
- TestBacklogToTodoByAgentSameIssueDoesNotSelfTrigger asserted
`count(*) WHERE status='queued'` was 0, but the seeded task itself
showed up in the count → got 1.
- TestBacklogToTodoByAgentSameAgentDifferentIssue seeded a task for
the same (issue_id, agent_id) as step1's auto-enqueued queued task,
tripping idx_one_pending_task_per_issue_agent.
X-Task-ID semantically belongs to a currently-running task. Inserting
the seed with status='running' (and started_at=now()) keeps it outside
both the unique index and the queued-count assertions, so the tests
verify only what the handler does in response to the agent-driven
backlog→active promotion.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
multica issue status --help only documents <status> as a required
positional. Users have to discover the valid set via trial-and-error
(triggering 'Error: invalid status "X"; valid values: ...').
Add a Long description that lists the 7 valid statuses inline:
backlog, todo, in_progress, in_review, done, blocked, cancelled.
Pure docs change; no behavior changes.
Co-authored-by: Wington Brito <4412238+wingtonrbrito@users.noreply.github.com>
The previous system-comment wording ("promote any waiting `backlog`
sub-issues") let a planner agent flip every backlog sibling to `todo` on
the first child-done signal, ignoring per-sibling stated dependencies.
Tighten the prompt so the agent must read each sibling's description,
only promote items whose dependencies are satisfied, and leave the
status alone (and comment to confirm) when the parent's higher-level
breakdown conflicts with what a sibling lists as a prerequisite.
This is the short-term mitigation; a structured `blocked_by` edge is
out of scope here and will be designed separately.
Co-authored-by: multica-agent <github@multica.ai>
* feat(runtimes): cascade-archive agents on runtime delete (MUL-2667)
Replace the bare 409 "cannot delete runtime: it has active agents" with a structured response carrying the blocking agent list, and wire a cascade endpoint that archives those agents, cancels their tasks, pauses dangling autopilots and deletes the runtime in a single transaction. The unified DeleteRuntimeDialog opens directly in cascade mode when the runtime has bound agents, pivots from light to cascade if the strict DELETE refuses with runtime_has_active_agents, and re-prompts when the cascade refuses with runtime_delete_plan_changed (live agent set drifted while the dialog was open). The online-local self-healing rule is preserved at the affordance level (kebab hidden, Diagnostics button disabled with tooltip) and re-checked at confirm time as defence in depth.
Co-authored-by: multica-agent <github@multica.ai>
* fix(runtimes): close cascade race + i18n delete dialog (PR #3266 review)
- Acquire FOR UPDATE on the runtime row at the top of the cascade tx so
FK-validated agent INSERTs/UPDATEs that would point at this runtime
block until commit, and lock each currently-active agent row via
ListActiveAgentsByRuntimeForUpdate so a concurrent archive/move of
an existing active row also blocks.
- Switch the bulk archive from runtime-keyed (ArchiveAgentsByRuntime)
to ID-keyed (ArchiveAgentsByIDs), narrowed to the user-confirmed
expected_active_agent_ids set. Combined with the runtime row lock,
this guarantees no agent outside the confirmed plan can be silently
archived between plan-compare and archive even at read-committed.
- Wire delete-runtime-dialog.tsx to runtimes locale via useT(); add
detail.delete_dialog.{light,cascade} keys (EN with _one/_other
plurals, zh-Hans _other) covering titles, descriptions, warning,
notices, checkbox, buttons, table headers, presence labels, and
toasts. Resolves the i18next/no-literal-string CI failure.
- Locale parity test passes (51 tests). All 4 dialog test cases pass
unmodified (EN copy preserves original wording). Full views vitest:
91 files / 792 tests green; full server go test: green.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
Three small UI cleanups on the agent Skills tab:
- The blue "Importing creates a workspace copy that your team can edit
and reuse" callout was visual clutter — drop it (and the Info icon
import that it relied on).
- The intro paragraph conflated two things: the workspace-skills concept
(applies to every runtime) and the Allow-locally-installed-skills
toggle (only honoured by Claude Code and Codex; verified — none of
copilot/cursor/gemini/opencode/openclaw/hermes/pi/kimi/kiro read
agent.SkillsLocal). Rewrite the intro to only describe the main
concept; the toggle's own local_hint_on/off strings still carry the
Claude/Codex caveat where it belongs.
- The trimmed intro now fits one line, so flip the header row from
items-start to items-center so the text sits on the same baseline as
the "Add skill" button instead of clinging to its top edge.
* fix(daemon/execenv): refresh stale Codex config copies across env reuse (MUL-2646)
`copyFileIfExists` previously short-circuited whenever the per-task
`codex-home/{config.toml,config.json,instructions.md}` already existed,
so once the files were seeded at first Prepare they were never refreshed
again — even though `Reuse()` calls `prepareCodexHomeWithOpts` on every
resume. A user who rotated their Codex `~/.codex/config.toml` between
runs (e.g. switching the active `[model_providers.X]` `base_url`, or
pointing `env_key` at a freshly rotated API key) kept reading the stale
per-task copy on session resume. Codex then issued requests to the new
URL using the old key and the API rejected the token.
Treat any existing `dst` as something to drop and re-copy from the
current shared source, mirroring the symlink path that already refreshes
`auth.json` (#2126). The daemon-managed sandbox / multi-agent / memory
blocks are applied via marker-bracketed idempotent passes after the
copy, so a re-copy + re-ensure cycle preserves them.
Co-authored-by: multica-agent <github@multica.ai>
* fix(daemon/execenv): drop per-task Codex copy when shared source removed (MUL-2646)
Extend the MUL-2646 fix to the deletion arm of "sync the shared source":
`syncCopiedFile` (renamed from `copyFileIfExists`) now also removes the
per-task `dst` when the shared `src` is absent. The prior version
short-circuited on missing src and left `config.toml` / `config.json` /
`instructions.md` from the previous Prepare lingering in the per-task
home — so a user who removed a provider by deleting `~/.codex/config.toml`,
or pulled `config.json` / `instructions.md` out of the shared home, would
keep replaying the stale copy on session resume.
For `config.toml` the subsequent `ensureCodex{Sandbox,MultiAgent,Memory}Config`
passes recreate the file with only the daemon-managed default blocks, so
removing the shared file cleanly drops every user-managed
`[model_providers.X]` / `model_provider` line. For `config.json` and
`instructions.md` there is no daemon default, so they disappear in
lockstep with the shared source.
Adds `TestPrepareCodexHome_DropsCopiedConfigWhenSharedSourceRemoved`
covering the new path, and extends the refresh-arm test to assert the
multi-agent / memory marker blocks are still present after the copy is
refreshed.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
* refactor(editor): split rich text styles
* feat(issues): server-side sort + fix drag position corruption in non-manual sort
Backend: ListIssues and ListGroupedIssues now accept `sort` and `direction`
query params (position/priority/title/created_at/start_date/due_date).
ListIssues converted from sqlc to hand-written SQL for dynamic ORDER BY.
Priority sort uses CASE expression for semantic ordering.
Frontend: query keys include sort so changing sort triggers server refetch.
Client-side sortIssues() removed from board-view and list-view.
Drag-and-drop: non-manual sort disables within-column reorder (prevents
silent position corruption). Cross-column drag only updates status/assignee,
preserves original position. Column overlay shows current sort during drag.
Cache: query key split into prefix (list) for invalidation and full key
(listSorted) for queryOptions. All optimistic update paths use prefix
matching via getQueriesData to work with any active sort.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(board): prevent drag flicker by settling columns until mutation refetch
After drag-and-drop, the optimistic cache patch updates position values
without reordering the bucket array. The useEffect that rebuilds columns
from TQ data would overwrite the correct local drag order, causing cards
to snap back then forward. Fix: isSettlingRef blocks column rebuilds
between drag end and mutation onSettled.
Also invalidate issueKeys.list on WS position changes so other windows
refetch correctly sorted data instead of showing stale bucket order.
Includes debug logs (to be removed after verification).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(board): stabilize drag-and-drop for non-manual sort modes
Three behavioral fixes for board drag when sort != position:
1. Settling: isSettlingRef + settleVersion blocks column rebuilds
between drag-end and mutation settle, preventing the optimistic
cache patch (which updates position values without reordering the
bucket array) from overwriting the correct local column state.
2. Non-manual cross-column: handleDragOver returns prev (no visual
card movement — column highlight + sort label is sufficient).
handleDragEnd uses overCol directly instead of findColumn on the
card's current position (which would be the source column).
Cards use useSortable({ disabled: { droppable: true } }) to
suppress within-column insertion indicators.
3. Collision detection: when no card droppables exist (disabled in
non-manual sort), return column droppables from pointerWithin
instead of falling through to closestCenter, so isOver reflects
the column the pointer is actually inside.
Also: WS position changes now invalidate issueKeys.list so other
windows refetch correctly sorted data.
Insertion-position prediction intentionally omitted — PostgreSQL's
en_US.utf8 collation (glibc) cannot be faithfully replicated in
JavaScript (ICU/V8), and an inaccurate indicator is worse than none.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(sort): manual sort ignores direction param on both ends
Manual sort (position) is user-defined order via drag-and-drop —
reversing it has no product meaning.
Backend: sort=position now skips the direction query param and
always uses ASC. Both ListIssues and ListGroupedIssues handlers.
Frontend: sort object omits sort_direction when sortBy is position.
Direction toggle hidden in the display popover for manual mode.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* perf(board): memo columns + stabilize references to reduce re-renders
- BoardColumn, PaginatedBoardColumn, PaginatedAssigneeBoardColumn
wrapped in memo() — only columns with changed props re-render
- IssueAgentActivityIndicator wrapped in memo() — 111 snapshot
subscribers no longer trigger full re-render on every WS task event
- buildColumns rewritten from O(groups × issues) to single-pass O(n)
- EMPTY_IDS constant replaces ?? [] fallbacks (stable reference)
- EMPTY_CHILD_PROGRESS constant replaces new Map() default
- BOARD_COL_WIDTH / BOARD_CARD_WIDTH constants shared between
column and DragOverlay for consistent card dimensions
- issueListOptions + issueAssigneeGroupsOptions use
placeholderData: keepPreviousData so sort/filter changes don't
flash a full-page skeleton
- Loading skeleton scoped to content area only — header stays
rendered during data transitions
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: remove outdated server-side sort implementation plan
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #3200 introduced per-agent `skills_local=ignore` isolation that
mirrors the host's Claude config dir into a per-task scratch dir,
omitting `skills/` to keep broken local skills out of the CLI's
discovery path. The mirror walks entries inside `hostConfigDir`
(default: `$HOME/.claude/`), but Claude Code's default layout stores
its main config — login state, project history — at
`$HOME/.claude.json`, a *sibling* of `~/.claude/` rather than inside
it. Once `CLAUDE_CONFIG_DIR=$ISOLATED` is set, the CLI looks for
`$ISOLATED/.claude.json`, finds only `backups/.claude.json.backup.*`
(those live inside `~/.claude/` and DO get mirrored), and exits with:
Claude configuration file not found at: …/.claude.json
Not logged in · Please run /login
— so every agent with `skills_local=ignore` on a host using the
default Claude layout dies on the first turn. Flipping the toggle back
to "merge" restores the host CLAUDE_CONFIG_DIR and recovers the agent;
that's the workaround Bohan flagged in MUL-2661.
Fix: after the existing `mirrorHostClaudeExceptSkills`, run a new
`mirrorHostClaudeJSONIfMissing` that pulls `$HOME/.claude.json` into
the scratch dir as `.claude.json` when (a) the dest doesn't already
have one and (b) the host source dir is the default `$HOME/.claude/`.
The custom-CLAUDE_CONFIG_DIR path is left alone because a pinned
custom dir is expected to be self-contained — silently borrowing
`$HOME/.claude.json` from a different account would mask credential
drift.
The helper goes through `createFileLink`, so it inherits the same
symlink → junction → hardlink → copy fallback chain the rest of the
mirror uses on Windows-without-Developer-Mode hosts.
Tests:
- `TestMirrorHostClaudeJSONIfMissing_DefaultLayoutMirrorsParentFile`
covers the happy path with an injected `homeDir`/`fileLink`.
- `TestMirrorHostClaudeJSONIfMissing_AlreadyPresentNoop` asserts a
pre-existing dest `.claude.json` (from a custom CLAUDE_CONFIG_DIR
mirror) is not overwritten.
- `TestMirrorHostClaudeJSONIfMissing_CustomHostDirSkipped` locks in
the custom-host-dir gate.
- `TestMirrorHostClaudeJSONIfMissing_MissingSourceNoop` documents the
env-var-auth-only / fresh-install case.
- `TestClaudeExecuteIsolatesProvidesClaudeJSONFromHome` is the
end-to-end MUL-2661 regression: a fake `\$HOME` with the default
split layout, `skills_local=ignore`, fake claude binary that prints
whatever `.claude.json` reaches the scratch dir. Asserts the file
rides through. Verified the test fails (with the documented
MUL-2661 error message) when the new mirror call is removed.
Verification:
- `go test ./pkg/agent/...` green (full agent suite).
- `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` clean.
Co-authored-by: multica-agent <github@multica.ai>
* feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603)
Adds an agent-scoped `skills_local` switch ("ignore" default / "merge") so
shared agents stop inheriting the operator's user-global Claude skill
directory. A single broken local skill on one operator's machine was
crashing the Claude CLI before it ever read stdin — the daemon saw a
"broken pipe" with no recoverable signal (GitHub #3052).
- DB: migration 108 adds `agent.skills_local` (NOT NULL DEFAULT 'ignore'),
with sqlc CreateAgent/UpdateAgent updates and handler validation.
- Claude runtime: when the agent is in "ignore" mode the backend points
CLAUDE_CONFIG_DIR at an empty per-task scratch dir under the task cwd
(fallback: OS temp), strips any inherited override, and cleans up after
the run. Workspace skills under `{cwd}/.claude/skills/` still load.
"merge" preserves the legacy inherit-from-machine behavior; Codex and
other isolated backends are no-ops.
- UI: new Skills toggle in the Create Agent dialog and the Agent → Skills
tab, with EN/zh-Hans copy and SkillsLocalToggle shared between the two.
- Tests: unit coverage for the new env helper, isolation dir lifecycle,
full Claude execute paths (ignore + merge), and the handler tristate
contract. Existing skills-tab test updated for the new copy.
- Docs: updated `/skills` docs (EN + ZH) and added a 0.3.7 changelog entry
in the landing-page i18n.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agent): preserve claude login + validate skills_local input (MUL-2603)
Address Elon's review on PR #3200:
1. Skill isolation no longer drops the operator's Claude login. The
per-task scratch dir now mirrors every entry under `~/.claude/`
as symlinks except `skills/`, so `.credentials.json`, settings,
plugins, etc. reach the CLI exactly as on the host while the
user-global skills directory stays hidden. Without this, default
`ignore` would have broken every Claude agent on a non-API-key
host the moment migration 108 landed.
2. Internal CreateAgent callers (agent_template, onboarding_shim)
now set `SkillsLocal: "ignore"`. The Go zero value was about to
trip the migration-108 CHECK constraint and 500 template /
onboarding agent creation.
3. Create / update handler validation no longer normalizes garbage
to "ignore". The strict 400 path is now reachable on bad client
input; the drift-safe `normalizeSkillsLocal` stays on the read
side only.
UI copy + docs clarified that the toggle is Claude-only; other
runtimes ignore the setting.
Verification:
- `go test ./...` green (full suite locally).
- `pnpm --filter @multica/views exec vitest run agents/components/tabs/skills-tab.test.tsx` green.
- Handler DB-backed tests still skip locally without docker (same
as Elon's run) — CI will validate the create / update paths
against migration 108.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agent): mirror effective claude config dir with windows fallback (MUL-2603)
Address Elon's second-round review on PR #3200:
1. The per-task scratch dir now mirrors the *effective* host Claude
config dir, not unconditionally `~/.claude/`. Precedence: agent
`custom_env` CLAUDE_CONFIG_DIR > parent process env > `~/.claude/`.
Without this, an operator who pinned Claude at a managed install
(custom env CLAUDE_CONFIG_DIR) would get the wrong credentials in
the scratch dir, because `buildClaudeEnv` strips that env before
handing it to the child. We resolve the source up front and feed
it to the mirror, so the override env still points at the right
bytes.
2. Mirror entries now go through platform-aware linkers. On Windows
without Developer Mode / admin, `os.Symlink` is denied, which
previously left the scratch dir empty and broke Claude Code auth
on default `ignore`. The new helpers try symlink first, then fall
back to a directory junction (`mklink /J`) for dirs or a hardlink
(same-volume content share) / copy for files. Mirrors the
execenv/codex_home_link_windows.go pattern.
3. Tests:
- `TestResolveHostClaudeConfigDir` locks in the custom_env >
parent_env > `~/.claude` precedence.
- `TestNewIsolatedClaudeConfigDirMirrorsCustomHostDir` confirms
the scratch dir picks up `.credentials.json` from a synthetic
custom host dir, proving the source resolution actually
propagates into the mirror.
- `TestNewIsolatedClaudeConfigDirEmptyHostIsNoop` documents the
env-var-auth-only case (no host source ⇒ empty scratch dir).
- `TestMirrorHostClaudeExceptSkillsWith_FallbackWhenSymlinkFails`
exercises the Windows-no-Developer-Mode path via the new
`mirrorHostClaudeExceptSkillsWith` seam, asserting credentials
and sub-dir children still reach the scratch dir after the
symlink stand-in fails.
- `TestMirrorHostClaudeExceptSkillsWith_PropagatesFirstLinkError`
confirms callers see the per-entry error when even fallback
fails (so the warn-log fires on broken Windows installs).
- `TestCopyFileRoundTrip` covers the last-resort copy fallback
and its EXCL no-overwrite contract.
- `TestClaudeExecuteIsolatesUsesCustomEnvSource` is the
end-to-end check: an agent with custom_env CLAUDE_CONFIG_DIR
reads its credentials from the pinned dir, not `~/.claude/`.
4. Docs: `apps/docs/content/docs/skills.{mdx,zh.mdx}` updated to
describe the effective-source resolution and the Windows
fallback chain so the docs match the runtime behaviour.
Verification:
- `go test ./...` green (full server suite locally, including
`pkg/agent` 23 cases covering the new + existing isolation
paths).
- `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and
`go test -c -o /dev/null` both compile clean, confirming the
Windows-tagged linker file builds.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agent): default skills_local to merge to preserve legacy behavior (MUL-2603)
Per Bohan's product decision on PR #3200, the per-agent host-skill toggle
defaults to "merge" — the pre-MUL-2603 inherit-from-machine behavior —
so existing personal workflows that rely on locally installed Claude
Skills keep working unchanged. Agent owners explicitly opt into "ignore"
when they need to harden a shared agent against a broken local skill on
one operator's machine (GitHub #3052).
Also audited all 11 runtimes for user-global skill discovery paths and
documented the scope of the toggle. Only Claude reads a user-global
`~/.claude/skills/`; Codex isolates via `CODEX_HOME`, the ACP backends
(Hermes / Kimi / Kiro) and the JSON-stream backends (Copilot / Cursor /
Gemini / Pi / OpenCode / OpenClaw) anchor discovery to the task workdir
and never read a user-global skill directory. UI copy and docs now say
"for runtimes that support it (currently Claude Code)" everywhere so
the scope is explicit.
Changes:
- Migration 108: column default flipped to 'merge'.
- Handler CreateAgent: missing field → "merge"; explicit "ignore" /
"merge" still validated, garbage still 400.
- normalizeSkillsLocal: drift-safe coercion now lands on "merge" for
anything that isn't the exact literal "ignore".
- agent_template.go / onboarding_shim.go: internal CreateAgent callers
send "merge" instead of "ignore" to match the new default.
- Claude runtime (`claude.go`): isolate-mode gate flipped from
`SkillsLocal != "merge"` to `SkillsLocal == "ignore"`, so "" (legacy
daemons / older clients) and "merge" both walk `~/.claude/` directly.
- Create Agent dialog + Skills tab: toggle defaults to on (merge); only
duplicate of an explicit "ignore" agent carries through. The
isolation opt-in is now `skills_local: "ignore"` when the user flips
off; "merge" is omitted from the request body.
- i18n (EN + zh-Hans): copy reframed — "On (default) — merged"; "Off —
ignored. Recommended for shared agents".
- Docs (`/skills`, `/guides/agents.zh`): describe new default and
enumerate which runtimes act on the toggle.
- Landing changelog 0.3.7: retitled "Per-Agent Local-Skill Toggle"; note
the on-by-default behavior + off-to-isolate framing.
- Tests:
- `TestClaudeExecuteIsolatesHostSkillsWhenIgnoreOptedIn` replaces the
old by-default isolation case (now requires explicit "ignore").
- New `TestClaudeExecuteDefaultModeKeepsHostConfigDir` locks in that
default ExecOptions preserve the host CLAUDE_CONFIG_DIR.
- `TestClaudeExecuteIsolatesUsesCustomEnvSource` now explicitly opts
into "ignore" mode.
- Handler tests: omitted → "merge"; explicit "ignore" round-trips;
preserve-existing test seeds "ignore" and asserts "merge" flip-back.
- `TestNormalizeSkillsLocal_DriftStaysSafe`: only literal "ignore"
maps to ignore; everything else → "merge".
- `skills-tab.test.tsx`: toggle ON by default; flip OFF when agent
opted into "ignore". Intro-text matcher anchored to a more specific
phrase so it no longer collides with the toggle hint copy.
Verification:
- `go test ./...` green (full server suite locally).
- `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and
`go test -c -o /dev/null` both compile clean (windows-tagged linker
file still builds).
- `pnpm typecheck` green across all packages and apps.
- `pnpm --filter @multica/views test` 88 files / 771 tests green.
- `pnpm --filter @multica/core test` 43 files / 390 tests green.
- Handler DB-backed tests still skip locally without docker; CI will
validate the create / update paths against migration 108.
Co-authored-by: multica-agent <github@multica.ai>
* chore(landing): drop 0.3.7 changelog entry from this PR (MUL-2603)
The landing-page release notes belong in a separate release-prep PR, not in the feature PR.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agent): propagate skills_local=ignore to codex user-skill seed (MUL-2603)
Make the per-agent skills_local toggle real for Codex too, not just Claude.
Previously the toggle was only consumed by the Claude backend, while the
daemon's execenv layer always seeded Codex's per-task CODEX_HOME with the
host machine's user-installed skills from ~/.codex/skills/. A shared Codex
agent with skills_local=ignore could still inherit a broken local skill
from one operator's machine.
Now: PrepareParams/ReuseParams carry SkillsLocal; hydrateCodexSkills
skips seedUserCodexSkills when SkillsLocal == "ignore" so the per-task
CODEX_HOME exposes only workspace skills to the codex CLI. Default
("merge", or empty from older servers/clients) preserves existing
inherit-from-machine behavior. UI / docs are updated to reflect the
contract honestly: Claude Code and Codex honor the toggle; other
runtimes (Hermes / Kimi / Kiro / Copilot / Cursor / Gemini / Pi /
OpenCode / OpenClaw) leave $HOME untouched and discover user-level
skills natively, so the toggle is a no-op for them today.
New tests: TestPrepareCodexSkillsLocalIgnoreSkipsUserSeed,
TestPrepareCodexSkillsLocalMergeSeedsUserSkills, and
TestReuseCodexSkillsLocalIgnoreSkipsUserSeed cover Prepare(ignore),
Prepare(merge), and the toggle-flip-on-reuse path.
Co-authored-by: multica-agent <github@multica.ai>
* docs(skills): scope skills_local toggle copy to Claude Code + Codex (MUL-2603)
Off-state hint and Skills tab intro now explicitly call out Claude Code +
Codex as the only runtimes that honor the toggle, with "other runtimes
ignore this setting" wired into both states (en + zh-Hans), so users on
non-Claude/Codex agents don't read "Off" as runtime-wide isolation.
Docs (skills.mdx, skills.zh.mdx, guides/agents.zh.mdx) stop describing
Hermes / Kimi / Gemini / Copilot / Cursor / Pi / OpenCode / OpenClaw / Kiro
as having native user-level skill discovery; the daemon simply does not
manage user-level skill discovery for those runtimes today, and the toggle
is a no-op regardless of where it is set.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
When no assignee was set, the entire meta row (assignee + dates + child progress) could disappear because showAssignee required both storeProperties.assignee AND issue.assignee_id. Now the row visibility depends only on storeProperties.assignee, and unassigned cards show "Unassigned" text with a clickable picker to assign.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* fix(deps): add eslint phantom dep detection + fix existing violations (MUL-2654)
Introduce eslint-plugin-import-x/no-extraneous-dependencies rule to
prevent phantom deps from causing production build splits when pnpm
creates peer-dep variants. Fix all existing phantom deps across the
monorepo, unify catalog references, and enable desktop smoke CI on PRs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* revert(ci): remove desktop smoke PR trigger per user feedback
The existing smoke workflow only verifies packaging completes — it does
not actually start the app or check rendering. This means it wouldn't
have caught the white-screen bug (which was a runtime issue, not a build
failure). Adding it to PRs would slow CI without providing meaningful
protection. The ESLint no-extraneous-dependencies rule is the actual
prevention mechanism.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* fix(deps): sync pnpm-lock.yaml for rehype-sanitize dep classification
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* fix(ui): move rehype-sanitize to deps + declare eslint-config (MUL-2654)
- Move rehype-sanitize from devDependencies to dependencies (used in
production Markdown.tsx)
- Add @multica/eslint-config to devDependencies (imported by
eslint.config.mjs but previously undeclared)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* feat(views): add sticky positioning to list-view group headers
Group headers now stay pinned at the top of the scroll viewport so users
always know which status group they are looking at. Background changed
from semi-transparent to opaque to prevent content bleeding through.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* fix(views): remove top padding from list-view scroll container for sticky headers
The `p-2` padding on the scroll container caused an 8px gap above sticky
group headers. Replace with `px-2 pb-2` to keep horizontal and bottom
padding while allowing headers to stick flush to the top. Sync skeleton
containers in issues-page and my-issues-page to match.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* fix(views): use p-2 pt-0 instead of px-2 pb-2 for list-view scroll container
Reporter preferred adding pt-0 to override the top padding from p-2,
keeping the original p-2 shorthand intact.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* fix(views): opaque sticky header hover + cursor-pointer on trigger
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* fix(chat): unify expand and drag-to-max rendering so both produce same dimensions (MUL-2653)
Expand button used CSS `inset-3` (parent minus 24px each side) while
drag-to-max used explicit 90%-of-parent pixel dimensions — different
sizes for the same conceptual state. Expand also hid resize handles,
preventing drag-back. Now both paths render with explicit width/height
at bottom-right and resize handles stay visible in all states.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* fix(chat): animate width/height via framer-motion for smooth expand toggle (MUL-2653)
Move width/height from style prop into animate prop so framer-motion
interpolates size changes. Remove layout="position" which only tracked
position. Drag uses duration:0 for instant feedback.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
Packaged renderer was bundling two copies of @tanstack/react-query because
apps/desktop imported it without declaring the dep, so Node resolution fell
through to the hoisted root variant (react@19.2.0, pulled in by apps/mobile),
while packages/core resolved to the catalog variant (react@19.2.3). Two physical
paths → two QueryClientContexts → "No QueryClient set" white screen on launch.
- Declare @tanstack/react-query, lucide-react, zustand as direct deps via catalog:
so apps/desktop resolves to the same peer variant as packages/core/views.
- Add @tanstack/react-query to renderer dedupe as a defense-in-depth bound
against future peer drift.
Verified: realpaths under apps/desktop, packages/core, packages/views all point
to @tanstack+react-query@5.96.2_react@19.2.3; production renderer bundle now
contains exactly one "use QueryClientProvider to set one" string (was 2) and
no useQueryClient\$1 suffix.
Co-authored-by: multica-agent <github@multica.ai>
apps/web postinstall runs fumadocs-mdx, which reads
apps/web/source.config.ts. The deps stage only copied
package.json files, so `pnpm install --frozen-lockfile`
failed with "Could not resolve /app/apps/web/source.config.ts"
and blocked the GHCR multica-web image build in the v0.3.7 release.
Co-authored-by: multica-agent <github@multica.ai>
In the trailing activity block's default truncated state ("last 8 shown,
N older hidden"), we were rendering two stacked chevron rows: a "v N
activities" collapse header and a "> Show N more activities" reveal link.
Visually that looked like nested folds even though they're parallel
controls, and the header is redundant when the user just wants a glance
at recent activity.
Drop the header in the truncated default state. It reappears the moment
the user clicks "Show N more" — at that point they're seeing the full
block and a fold-back affordance becomes useful again. Blocks that fit
within the 8-entry limit (and non-trailing blocks, which never truncate)
keep their header as before.
* feat(issues): truncate trailing activity block to most recent 6 (MUL-2628)
The trailing activity block defaults to expanded, but a block with dozens
of entries still drowns the comment area. Show only the most recent 6 by
default; older entries fold behind an in-place "Show N more activities"
toggle. Non-trailing blocks are unchanged — they still collapse whole.
The "show older" choice is tracked per block id in a separate Set so it
survives the block losing its trailing position (when a new comment
lands after it) and survives a collapse/re-expand cycle.
Co-authored-by: multica-agent <github@multica.ai>
* refactor(issues): bump trailing activity block visible limit from 6 to 8 (MUL-2628)
User feedback on the original PR: 6 felt slightly too tight. Bumped the
trailing-block truncation threshold to 8 entries to give the "most recent
activity" view a bit more headroom before older entries fold behind the
"Show N more activities" toggle.
Test count is unchanged; the existing trailing-block / non-trailing-block
truncation cases were adjusted to exercise the new 8-entry boundary
(10-entry trailing block → 2 hidden; 8-entry trailing block → none
hidden; 10-entry non-trailing block → all visible after expand).
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
* feat(agents): remove custom_env from agent resources, add audited env endpoint (MUL-2600)
The agent resource shape (list / get / create / update / archive /
restore responses + WebSocket events) no longer carries `custom_env`
values. Reads/writes of env now flow exclusively through a dedicated
`/api/agents/{id}/env` endpoint that is owner/admin-only, rejects
agent-actor sessions, applies a "****" sentinel preserve guard on
PUT, and writes a persistent audit row per reveal/update.
Why
- `multica agent list --output json` historically returned plaintext
`custom_env` for owner/admin callers (the redaction gate gave only
members the masked map). Any agent token running on the workspace
inherits its owner's role and could read every other agent's
secrets just by listing.
- Patching list/get redaction alone (PR #3175 direction) left
symmetric leaks via mutation responses, WS events, the "reveal"
path itself (no actor-aware auth), and a `****` overwrite footgun
on UpdateAgent.
What changed
- Backend: drop `custom_env` from AgentResponse; add coarse
`has_custom_env` + `custom_env_key_count`. Strip env handling from
UpdateAgent (silently ignored if sent). Keep CreateAgent's
custom_env acceptance.
- Backend: new GET/PUT `/api/agents/{id}/env` handlers in
`internal/handler/agent_env.go`:
- resolveActor → 403 for agent actors (closes the lateral-movement
path).
- Owner/admin role gate via existing helper.
- PUT honours value == "****" as "preserve existing value".
- Both write to `activity_log` with `agent_env_revealed` /
`agent_env_updated` actions. Audit details record key names only,
never values.
- Daemon claim path (`ClaimAgentTask`) unchanged — `TaskAgentData`
still carries plaintext env for runtime injection.
- SQL: new `UpdateAgentCustomEnv` query; sqlc regenerated (v1.31.1).
- CLI: new `multica agent env get|set` subcommands. `--custom-env*`
flags removed from `multica agent update`; the no-fields error
now points to the new path.
- Frontend: drop env fields from `Agent` + `UpdateAgentRequest`; add
`getAgentEnv` / `updateAgentEnv` client methods; rewrite env-tab
to show "N variables configured" + explicit "Reveal & edit"
button, fetching values only on intentional reveal.
- Locales: parity-safe additions to en + zh-Hans.
- Docs: agents-create.{mdx,zh.mdx} reflect the new threat model and
endpoint.
- Mobile: schema drops `custom_env` / `custom_env_redacted`, adds
metadata fields.
Tests
- Handler tests pinned the new invariants: no env in list/get
responses, owner reveal happy-path + audit row, agent-actor 403,
`****` sentinel preserves real values, UpdateAgent silently
ignores `custom_env`, pure `mergeAgentEnv` cases.
- CLI tests pivot to the new flag surface: `agent update` MUST NOT
expose the env flags; `agent env set` MUST expose
--custom-env-stdin/--custom-env-file.
- Frontend test fixtures updated; pnpm typecheck / test / lint
pass cleanly.
This is a breaking API change. Scripts that read `custom_env` from
`/api/agents` must migrate to `GET /api/agents/{id}/env`.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agents): close actor-spoofing + audit fail-closed in env endpoints (MUL-2600)
Addresses Elon's review of #3209:
* Mint a task-scoped `mat_` token per claim, bound to (agent, task,
workspace, owner). Daemon injects it into the agent process in place
of its own credential. Auth middleware authoritatively rebuilds
X-User-ID / X-Agent-ID / X-Task-ID from the token row and sets
X-Actor-Source=task_token; that header is server-set only — incoming
values are stripped before any auth branch runs. resolveActor honors
the header so an agent that strips X-Agent-ID / X-Task-ID still
resolves as actor=agent.
* GetAgentEnv / UpdateAgentEnv are now fail-closed on audit-log
failures: GET refuses to return plaintext, PUT persists inside the
same tx as the audit row so they commit/roll back together.
* PUT /api/agents/{id} returns 400 when the body carries custom_env
instead of silently dropping it — directs callers to the audited env
endpoint.
* Agent actors never see mcp_config, even when the underlying member
is owner/admin; mutation broadcasts go through a redaction shim so
WS subscribers don't pick it up either.
* Fix backend test that asserted dense JSON (jsonb::text renders
whitespace) and frontend test that assumed a unique "Test User"
match.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agents): close residual MUL-2600 gaps from review (MUL-2600)
Migration 108 FK now correctly references agent_task_queue(id) instead
of the non-existent agent_task table; the previous name blocked CI
backend migrations.
Task-token-authenticated requests can no longer be re-routed at a
different workspace by passing workspace_slug / workspace_id /
?workspace_id / a URL workspace param. ResolveWorkspaceIDFromRequest
and resolveWorkspaceUUID both short-circuit on X-Actor-Source=task_token
and return only the token-bound X-Workspace-ID; buildMiddleware adds a
defence-in-depth 403 if any URL-resolved workspace disagrees with the
token binding.
mcp_config no longer leaks back to agent actors through UpdateAgent /
CreateAgent / ArchiveAgent / RestoreAgent HTTP responses — the same
redactAgentResponseForActor helper that GetAgent/ListAgents use is now
applied to mutation responses too. WS broadcasts were already redacted
via broadcastAgentResponse.
FailTask and every TaskService cancel path (CancelTask /
CancelTasksForIssue / CancelTasksForAgent / CancelTasksByTriggerComment
/ BroadcastCancelledTasks) now eagerly DeleteTaskTokensByTask so the
mat_ token's 24h window doesn't outlive a terminated task. Failure is
non-fatal — the FK cascade and expiry remain durable guards.
Doc-only: clarify that PUT /api/agents/{id} now hard-rejects bodies
that carry custom_env (was previously "silently ignores").
Tests:
- middleware: TestResolveWorkspaceIDFromRequest gains a task_token
case asserting client-supplied slug/id/query cannot override the
bound workspace.
- handler: TestUpdateAgent_RedactsMcpConfigForAgentActor and
TestUpdateAgent_KeepsMcpConfigForMemberActor pin the mutation-
response redaction contract per actor type.
Co-authored-by: multica-agent <github@multica.ai>
* fix(agents): match redacted mcp_config as JSON null, not Go nil (MUL-2600)
`AgentResponse.McpConfig` is `json.RawMessage` without `omitempty`, so
the redacted response serialises as `"mcp_config": null`. On decode,
`json.RawMessage` keeps the literal bytes `null` rather than collapsing
to Go nil, which made the assertion fire on a non-leak.
The product contract (field always present, distinguished from "no
config" via `mcp_config_redacted`) is intentional, so adjust the test
to check for "no secret-bearing content" instead of weakening the
contract via `omitempty`.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
When a user explicitly @-mentions an agent on an issue assigned to a squad,
the existing rule already suppresses the squad leader on the mention
comment itself — the user is routing deliberately, the mentioned agent
owns the next step. The leader was still woken on the agent's reply,
though, so it would re-@ the user every time the agent answered.
Extend the suppression to the second leg of that explicit exchange:
when an agent reply lands as a child of a member comment that carried a
routing @mention (agent/member/squad/all — issue cross-refs still
ignored), the leader stays out. The CreateComment handler already pins
agent parent_id == task.TriggerCommentID, so this fires exactly when
the agent's reply is provably tied to the upstream routing comment.
Top-level agent comments and agent-to-agent threads continue to wake
the leader so coordination keeps working everywhere else.
Co-authored-by: multica-agent <github@multica.ai>
Follow-up to #3196. Switching tabs and back on a long issue still landed at
scrollTop=0 because issue-detail uses Virtuoso with customScrollParent —
Virtuoso wires its scroll/resize observers in a passive useEffect, which
fires *after* useLayoutEffect. So at the moment the restore hook ran, the
spacer that gives the scroll container its tall scrollHeight hadn't been
re-established yet (scrollHeight === clientHeight), and the browser
silently clamped `scrollTop = saved` down to 0.
Diagnostic console output confirmed this:
marker key=true saved=10356.5 currentScrollTop=0 scrollHeight=750 clientHeight=750
→ set scrollTop to 10356.5 actually now 0
Fix: keep the synchronous set as the fast path, then if the assignment was
clamped, retry across rAF frames for up to ~500ms (30 frames at 60fps).
That gives Virtuoso's passive effect time to re-establish the spacer, after
which the next tick succeeds. Cancel any in-flight retry when the effect
tears down (Activity hidden again or component unmount).
Existing 4 tests in use-tab-scroll-restore.test.tsx still pass — the
synchronous fast path covers the simple-content case they exercise. A
jsdom regression for the Virtuoso scenario didn't reproduce reliably (the
clamp + rAF interplay needs a real browser), so this relies on manual
verification: open issue-detail, scroll deep into comments, switch tabs,
switch back — scroll position now holds.
Closes#3183.
Tabs render under `<Activity mode="visible|hidden">`, which keeps React
state but drops DOM scrollTop when the subtree leaves layout. Switching
to another tab and back sent users to the top of long discussions.
`useTabScrollRestore` records the scrollTop of every element marked with
`data-tab-scroll-root` while the tab is visible (capture-phase scroll
listener) and restores them in a useLayoutEffect on the next visible
transition, before paint. Saved offsets are dropped when the tab's path
changes so intra-tab navigation lands at scroll=0 instead of inheriting
the previous route's position.
Mark scroll containers in views with `data-tab-scroll-root` (issue
detail + chat message list ship with the marker; other views can adopt
the convention as needed).
`useAutoScroll` previously called `scrollToBottom()` on every effect
mount, which would have overwritten the restored offset every time a
chat tab cycled back to visible. Guard it with a once-per-instance ref.
Co-authored-by: multica-agent <github@multica.ai>
Pi reads its prompt from argv (positional, see buildPiArgs) and never
expects interactive input, so the Pi backend previously left cmd.Stdin
nil. Under systemd, the resulting /dev/null character device has been
observed not to satisfy Pi's readable-side wait, leaving runs stuck in
"working" forever (#2188).
Attach an explicit StdinPipe and close it immediately after Start so the
child sees an EOF on a FIFO, matching the pattern already used by the
Claude, Codex, Hermes, Kiro, and Kimi backends. The fix is defensive on
the daemon side because Pi is mid-refactor and is not accepting issues
upstream; once Pi itself stops blocking on stdin, this close is still
correct (a closed pipe is a no-op for a process that does not read it).
Test asserts the structural invariant: a shell-stub `pi` inspects
/proc/self/fd/0 and only emits a valid event stream when stdin is a
FIFO. If a future change drops the StdinPipe and stdin reverts to
/dev/null (char device), the stub exits non-zero and the test fails.
Adds rows to MODEL_PRICING for the Chinese-model SKUs listed on each
provider's official pricing page, so opencode / OpenRouter-routed
runtimes stop showing $0.00 in the dashboard for these models.
Sources (now cited inline above the table):
- DeepSeek: https://api-docs.deepseek.com/quick_start/pricing
- Moonshot: https://www.kimi.com/resources/kimi-k2-6-pricing
- Zhipu z.ai: https://docs.z.ai/guides/overview/pricing
Notes vs the closed PR #3170:
- Only SKUs that exist on the official pages are added. glm-z1*,
deepseek-v4-pro at $0.55/$2.19, kimi-k2.6 at K2's tier were all
hallucinated and are NOT included.
- deepseek-chat / deepseek-reasoner are routed by DeepSeek to
deepseek-v4-flash, so they share the v4-flash rate.
- deepseek-v4-pro is priced at the post-promo standard rate
($1.74 / $3.48), not the 75%-off promo that ends 2026-05-31. Brief
over-estimate beats a sudden 4x jump on June 1.
- glm-*-flash are priced at $0 because z.ai's free tiers are the
literal published price.
Co-authored-by: multica-agent <github@multica.ai>
Codex CLI's auto-memory subsystem writes summaries to
`$CODEX_HOME/memories/raw_memories.md` and `state_*.sqlite`, then reads
them back on the next turn. The daemon never cleared these files across
Reuse(), and Codex CLI may also pull from user-level `~/.codex/memories/`
entirely outside the per-task isolation. Either path leaks unrelated
context into new Multica tasks — multica#3130 saw `D:\Project\MoHaYu\
WowChat` Raw Memories injected into a brand-new issue's first turn.
Write a daemon-managed block into the per-task `config.toml` that sets
`features.memories = false`, `memories.generate_memories = false`, and
`memories.use_memories = false`. Codex then neither writes nor reads
its memory subsystem regardless of where the residual files live. The
user's global `~/.codex/config.toml` is never touched.
Pattern mirrors `ensureCodexMultiAgentConfig`: idempotent managed-block
upsert, two TOML layout variants (root dotted-key vs. inside a `[features]`
/ `[memories]` table) to satisfy strict toml-rs parsing, and a
`MULTICA_CODEX_MEMORY` env-var escape hatch.
MUL-2598
Co-authored-by: multica-agent <github@multica.ai>
Add Description field to RepoData structs so that workspace repo
descriptions (set via the settings UI) are preserved through
normalization and rendered in the agent brief as:
- <url> — <description>
When no description is set, the existing format is unchanged.
Closes MUL-2610
Co-authored-by: multica-agent <github@multica.ai>
- Add optional description field to WorkspaceRepo type
- Show description input below URL in edit mode
- Display description text in view mode
- Update isDirty to compare descriptions
- Update tests for new field
Co-authored-by: multica-agent <github@multica.ai>
Remediates two pgx security advisories in a single bump:
- CVE-2026-33816 (fixed in 5.9.0) — pgproto3 memory-safety DoS from
malformed messages sent by a malicious server.
- GHSA-j88v-2chj-qfwx / CVE-2026-41889 (fixed in 5.9.2) — SQL injection
via placeholder confusion with dollar-quoted literals under
QueryExecModeSimpleProtocol. Not reachable in this codebase (no
simple-protocol callers), but pinned to clear future scanner runs.
No source changes needed: pgx 5.9.x adds no breaking APIs over 5.8.x
(adds PG protocol 3.2 support, SCRAM-SHA-256-PLUS, OAuth, plus
pgtype/pgconn bug fixes). Minimum Go bumped to 1.25 in 5.9.0; repo
already on 1.26.1.
MUL-2597
Co-authored-by: multica-agent <github@multica.ai>
* fix: sort timeline entries by created_at on WebSocket append
When multiple agents post comments concurrently, WebSocket events may
arrive out of chronological order. The handlers blindly appended new
entries to the end of the cached timeline array, causing display
misordering. This fix sorts the array by created_at (with id as
tie-breaker) after each insert.
Changes:
- use-issue-timeline.ts: sort after comment:created and activity:created
- issue-ws-updaters.ts: sort in appendTimelineEntry
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(views): extract sortTimelineEntriesAsc helper, cover mutation onSuccess
Review feedback from @Bohan-J: useCreateComment.onSuccess also appends
unsorted (mutations.ts:558). When the local user posts a comment whose
HTTP response returns after a concurrent WS event, the unsorted append
leaves the cache misordered and the subsequent WS dedup skips re-sort.
Extract sortTimelineEntriesAsc helper and reuse it in all three web
cache writers:
- comment:created WS handler
- activity:created WS handler
- useCreateComment.onSuccess
Mobile keeps its own inline sort (apps/mobile/CLAUDE.md boundary).
Add regression tests for sort position (mid-insert and oldest-insert).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* Include k8s deployment instructions
* Use helm for deployment
* docs(self-host): add Helm / Kubernetes deployment to quickstart (en + zh)
* fix(helm): gate backend ExternalName alias behind a value
The unprefixed Service/backend in the chart is load-bearing, but as
written it limits the chart to one release per namespace and fails
helm install whenever a Service/backend already exists in the
namespace (without --take-ownership).
Gate the alias behind frontend.compatibility.backendAlias (default
true, so existing installs are unchanged). Operators running a web
image with a patched REMOTE_API_URL can set it to false to drop the
Service entirely. Document the one-release-per-namespace constraint
and the opt-out in values.yaml and the SELF_HOSTING.md Kubernetes
section.
Addresses review item #1 on PR #2377.
* fix(helm): add backend startupProbe so cold installs survive migrations
The entrypoint runs `./migrate up` before serving traffic. On a cold
cluster (Postgres still coming up) this can take minutes, during which
the livenessProbe (initialDelaySeconds 30 / periodSeconds 30) trips and
restarts the pod 1-2 times.
Add a startupProbe on /healthz (failureThreshold 30, periodSeconds 10,
~5 min budget). Kubernetes disables liveness/readiness until it passes,
so migrations finish without the pod being killed, and the aggressive
livenessProbe is untouched for steady-state. Update the SELF_HOSTING.md
install step, which no longer expects 1-2 restarts.
Addresses review item #2 on PR #2377.
* fix(helm): roll backend pods on config/secret change via checksum annotations
envFrom does not watch the referenced ConfigMap/Secret, and helm
upgrade alone does not change the pod template hash, so editing
values.yaml + `helm upgrade` left the old backend pods running stale
config.
Add checksum/config (hash of the rendered configmap.yaml) and
checksum/secret (hash of the live existingSecret via lookup, since it
is created out-of-band and has no chart template) to the backend pod
template. Config edits now actually re-roll the backend on upgrade,
and Secret rotations do too. lookup is empty under
`helm template`/`--dry-run`; that placeholder is harmless and
documented inline.
Addresses review item #3 on PR #2377.
* docs(self-host): sync quickstart with new startupProbe behavior
SELF_HOSTING.md was updated to reflect that the backend now stays
Running but not Ready while Postgres comes up (startupProbe absorbs
it, so no restart), but the EN/ZH quickstart docs still described the
pre-startupProbe behavior of "may restart 1-2 times". Bring them in
line.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: Bohan Jiang <52446949+Bohan-J@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>
The LICENSE file adds commercial restrictions on top of Apache 2.0, so the
README should not advertise the project as plain "Apache 2.0". Match the
actual terms.
Closes#3144
Co-authored-by: multica-agent <github@multica.ai>
2026-05-25 12:47:33 +08:00
983 changed files with 108961 additions and 7232 deletions
@@ -176,6 +176,7 @@ make start-worktree # Start using .env.worktree
- Avoid broad refactors unless required by the task.
- New global (pre-workspace) routes MUST use a single word (`/login`, `/inbox`) or a `/{noun}/{verb}` pair (`/workspaces/new`). NEVER add hyphenated word-group root routes (`/new-workspace`, `/create-team`) — they collide with common user workspace names and force endless reserved-slug audits. Reserving the noun (`workspaces`) automatically protects the entire `/workspaces/*` subtree.
- The reserved-slug list lives in **one** place: `server/internal/handler/reserved_slugs.json`. The Go side embeds the JSON; `packages/core/paths/reserved-slugs.ts` is generated from it by `pnpm generate:reserved-slugs`. Edit the JSON, run the generator, commit both. CI re-runs the generator and fails on any drift, so a stale TS file cannot land.
- When you change a CLI command or flag, an API request/response field, or product behavior that a built-in skill documents (`server/internal/service/builtin_skills/*`), update that skill's `SKILL.md`**and** its `references/*-source-map.md` in the same PR. The built-in skills are source-traced contracts shipped to agents — if the code moves and the skill doesn't, it silently teaches stale behavior.
### API Response Compatibility
@@ -203,6 +204,14 @@ Every Go handler in `server/internal/handler/` follows these rules. The conventi
When adding a `Queries.Delete*` or `Queries.Update*` call, ask: "Where did this UUID come from?" If the answer is "raw user input that hasn't been validated," route it through `parseUUIDOrBadRequest` or a loader first.
### Dependency Declaration Rule
Every workspace (`apps/` and `packages/` directories) must explicitly declare all directly imported external packages in its own `package.json`. Relying on pnpm hoist to resolve undeclared imports (phantom deps) is prohibited — it causes production build failures when pnpm creates peer-dep variants.
- Use `"pkg": "catalog:"` to reference the shared version from `pnpm-workspace.yaml`.
- CI enforces this via `eslint-plugin-import-x/no-extraneous-dependencies`.
- Exception: `apps/mobile/` uses pinned versions (not `catalog:`) for packages tied to its own React/Expo version.
### Package Boundary Rules
These are hard constraints. Violating them breaks the cross-platform architecture:
`--mode` currently only accepts `create_issue` (creates a new issue on each run and assigns it to the agent). The server data model also defines `run_only`, but the daemon task path doesn't yet resolve a workspace for runs without an issue, so it's not exposed by the CLI. `--agent` accepts either a name or UUID.
`--mode` accepts `create_issue` (creates a new issue on each run and assigns it to the agent) or `run_only` (enqueues a direct agent task without creating an issue). `--agent` accepts either a name or UUID.
@@ -114,7 +114,7 @@ multica setup # Connect to Multica Cloud, log in, start daemon
multica setup # Configure, authenticate, and start the daemon
```
The daemon runs in the background and auto-detects agent CLIs (`claude`, `codex`, `copilot`, `openclaw`, `opencode`, `hermes`, `gemini`, `pi`, `cursor-agent`, `kimi`, `kiro-cli`) on your PATH.
The daemon runs in the background and auto-detects agent CLIs (`claude`, `codex`, `copilot`, `openclaw`, `opencode`, `hermes`, `gemini`, `pi`, `cursor-agent`, `kimi`, `kiro-cli`, `agy`) on your PATH.
### 2. Verify your runtime
@@ -124,7 +124,7 @@ Open your workspace in the Multica web app. Navigate to **Settings → Runtimes*
### 3. Create an agent
Go to **Settings → Agents** and click **New Agent**. Pick the runtime you just connected and choose a provider (Claude Code, Codex, GitHub Copilot CLI, OpenClaw, OpenCode, Hermes, Gemini, Pi, Cursor Agent, Kimi, or Kiro CLI). Give your agent a name — this is how it will appear on the board, in comments, and in assignments.
Go to **Settings → Agents** and click **New Agent**. Pick the runtime you just connected and choose a provider (Claude Code, Codex, GitHub Copilot CLI, OpenClaw, OpenCode, Hermes, Gemini, Pi, Cursor Agent, Kimi, Kiro CLI, or Antigravity). Give your agent a name — this is how it will appear on the board, in comments, and in assignments.
@@ -73,7 +73,7 @@ Open http://localhost:3000 in your browser. The Docker self-host stack defaults
- **Without email configured:** the verification code is generated server-side and printed to the backend container logs (look for `[DEV] Verification code for ...:`). Useful for one-off testing on a single machine.
- **Deterministic local/private testing:** set `APP_ENV=development` and `MULTICA_DEV_VERIFICATION_CODE=888888` in `.env`, then restart the backend. This fixed code is ignored when `APP_ENV=production`.
Changes to `ALLOW_SIGNUP` and `GOOGLE_CLIENT_ID` also take effect after restarting the backend / compose stack. The web UI reads both from `/api/config` at runtime, so no web rebuild is needed.
Changes to `ALLOW_SIGNUP`, `DISABLE_WORKSPACE_CREATION`, and `GOOGLE_CLIENT_ID` also take effect after restarting the backend / compose stack. The web UI reads all three from `/api/config` at runtime, so no web rebuild is needed. See [Advanced Configuration → Signup Controls](SELF_HOSTING_ADVANCED.md#signup-controls-optional) for the recommended sequence to lock down workspace creation.
> **Warning:** do **not** set `MULTICA_DEV_VERIFICATION_CODE` on a publicly reachable instance — anyone who knows an email address can then log in with that fixed code.
@@ -135,6 +135,262 @@ multica daemon status
3. Go to **Settings → Agents** and create a new agent
4. Create an issue and assign it to your agent — it will pick up the task automatically
---
## Kubernetes Deployment (Alternative)
If you already run a Kubernetes cluster, you can deploy Multica there instead of Docker Compose using the released OCI Helm chart at `oci://ghcr.io/multica-ai/charts/multica` or the source chart at [`deploy/helm/multica/`](deploy/helm/multica/). It targets a typical k3s / k8s setup with an Ingress controller and a default `ReadWriteOnce` StorageClass — authored against k3s + Traefik + `local-path`, and should work on any cluster with minor tweaks.
The chart creates the following resources in the target namespace:
-`multica-postgres` — `pgvector/pgvector:pg17` backed by a 10Gi PVC
-`multica-backend` — Go API/WS server. Backed by a 5Gi `ReadWriteOnce` uploads PVC by default; set `backend.uploads.persistence.enabled=false` when you have configured S3 (`backend.config.s3Bucket`) and don't want the chart to declare the PVC at all.
-`multica-frontend` — Next.js standalone server
- Two `Ingress` resources: one for the web host, one for the backend host
-`multica-config` ConfigMap (rendered from `values.yaml`)
The `multica-secrets` Secret is **not** managed by the chart — you create it once with `kubectl` so real values never need to land in git.
> **One release per namespace:** the prebuilt `multica-web` image bakes `REMOTE_API_URL=http://backend:8080` at build time, so the chart ships an ExternalName Service literally named `backend`. Because that name is unprefixed, you can run only one Multica release per namespace, and `helm install` will fail if a `Service/backend` already exists there (pass `--take-ownership`, or use a dedicated namespace). If you build a web image with a patched `REMOTE_API_URL`, set `frontend.compatibility.backendAlias: false` to drop the alias.
> **Prerequisites:** `kubectl` and `helm` (v3.13+ for `--take-ownership`, or v4+) configured for the target cluster, an Ingress controller (Traefik / NGINX), and a default StorageClass.
### Step 1 — Point hostnames at the cluster
The chart defaults to `multica.dev.lan` (web) and `api.multica.dev.lan` (backend). Pick one of:
- **`/etc/hosts`** on every machine that needs access (developer laptops + the machine running the daemon):
```text
192.168.1.206 multica.dev.lan api.multica.dev.lan
```
Replace `192.168.1.206` with any node IP where your Ingress controller's Service is reachable.
- **Local DNS** (Pi-hole, Unbound, etc.): add A records for both hostnames pointing at the cluster Ingress IP.
To use different hostnames, override the matching values at install time (see [Step 4](#step-4--install-the-chart)) — `ingress.frontend.host`, `ingress.backend.host`, plus `backend.config.appUrl`, `backend.config.frontendOrigin`, `backend.config.localUploadBaseUrl`, and `backend.config.googleRedirectUri`.
### Step 2 — Create the namespace
```bash
kubectl create namespace multica
```
### Step 3 — Create the `multica-secrets` Secret
The chart references this Secret by name. Create it once with random values:
Released chart versions strip the leading `v` from the Git tag. For example, release tag `v0.3.5` publishes chart version `0.3.5`; the chart defaults the backend and frontend image tags to `v0.3.5`.
To override defaults, export the chart values, edit them, and pass them with `-f`:
```bash
helm show values oci://ghcr.io/multica-ai/charts/multica \
On a cold cluster the backend can sit `Running` but not `Ready` for a few minutes while it waits on PostgreSQL and runs migrations — a startupProbe absorbs this, so the pod should not restart. Once the backend reports `Ready`, migrations have completed and `/healthz` returns OK:
The chart defaults to `APP_ENV=production` (set in `values.yaml` under `backend.config.appEnv`), and there is no fixed verification code by default. Pick one of the following to log in — the same three options as the Docker setup:
- **Recommended (production):** patch the Secret with a real Resend key, then restart the backend:
Real verification codes will be sent to the email address you enter. See [Advanced Configuration → Email](SELF_HOSTING_ADVANCED.md#email-required-for-authentication).
- **Without email configured:** the verification code is generated server-side and printed to the backend pod logs (look for `[DEV] Verification code for ...:`). Useful for one-off testing.
- **Deterministic local/private testing:** set `backend.config.appEnv: development` in your values file and `MULTICA_DEV_VERIFICATION_CODE=888888` in the Secret, then `helm upgrade` and restart. This fixed code is ignored when `APP_ENV=production`.
`ALLOW_SIGNUP`, `DISABLE_WORKSPACE_CREATION`, and `GOOGLE_CLIENT_ID` likewise live under `backend.config.*` in `values.yaml` (as `allowSignup`, `disableWorkspaceCreation`, and `googleClientId`). After `helm upgrade`, the backend pod will roll automatically because the ConfigMap hash changes; the web UI reads all three from `/api/config` at runtime, so no web rebuild is needed.
> **Warning:** do **not** set `MULTICA_DEV_VERIFICATION_CODE` on a publicly reachable instance — anyone who knows an email address can then log in with that fixed code.
### Step 6 — Install CLI & Start Daemon
The daemon runs on your local machine, not in the cluster. Install the CLI and an AI agent as in [Step 3](#step-3--install-cli--start-daemon) above, then point the CLI at your Ingress hostnames:
```bash
multica setup self-host \
--server-url http://api.multica.dev.lan \
--app-url http://multica.dev.lan
```
Make sure the machine running the daemon has the same `/etc/hosts` (or DNS) entries from [Step 1](#step-1--point-hostnames-at-the-cluster).
### Updating
To pull the latest images without changing the chart version when your values still use the mutable `latest` image tag:
> **Upgrading from `v0.3.4` to `v0.3.5+` fails with `refusing to drop legacy daily rollups: ...`?** Same migration guard as the Docker path — see [Usage Dashboard Rollup → Option C](#option-c--backfill-history-first-then-schedule). Run the backfill against the same database the chart is using (`kubectl -n multica exec deploy/multica-backend -- ./backfill_task_usage_hourly --sleep-between-slices=2s`), then restart the backend deployment to re-apply migrations.
### Tearing down
```bash
# Remove the workloads but keep the PVCs and the Secret
helm -n multica uninstall multica
# Wipe everything, including PostgreSQL data and uploads
kubectl delete namespace multica
```
---
## Usage Dashboard Rollup (Required)
Starting with `v0.3.5`, the Usage / Runtime dashboards read from a derived `task_usage_hourly` table rather than directly from `task_usage`. Raw `task_usage` rows are written by the backend on every task, but the dashboard only sees data after `rollup_task_usage_hourly()` runs and aggregates them into `task_usage_hourly`.
**The bundled `pgvector/pgvector:pg17` image does NOT include `pg_cron`.** If nothing schedules the rollup, the dashboard will stay at zero forever even though `task_usage` is populated. You have three supported options — pick one before relying on the dashboard.
> **Upgrading from `v0.3.4` to `v0.3.5+`** with existing `task_usage` history: migration `103` is fail-closed and will abort `migrate up` with `refusing to drop legacy daily rollups: …`. Run `backfill_task_usage_hourly` first (Option C below), then re-run the upgrade. **Fresh installs** are exempted by that guard and migrate cleanly — but the dashboard will still stay at zero until you pick Option A or Option B.
### Option A — External cron / systemd-timer (simplest)
Schedule a 5-minute job that calls `rollup_task_usage_hourly()`. It is idempotent and watermark-driven, so a missed tick catches up on the next run.
Or as a systemd timer + service if you prefer that surface. The function returns the number of (upserted + deleted-empty) rows; it's safe to call concurrently with itself (an advisory lock makes overlapping runs no-op) and safe to call alongside `backfill_task_usage_hourly`.
### Option B — Swap Postgres for an image that ships `pg_cron`
If you'd rather have Postgres schedule itself, replace `pgvector/pgvector:pg17` in `docker-compose.selfhost.yml` with an image that bundles both `pgvector` and `pg_cron` (e.g. `supabase/postgres`, or your own build of `pgvector/pgvector` with `pg_cron` added and `shared_preload_libraries=pg_cron` set on the server). Then, once:
```sql
CREATE EXTENSION IF NOT EXISTS pg_cron;
SELECT cron.schedule(
'rollup_task_usage_hourly',
'*/5 * * * *',
$$SELECT rollup_task_usage_hourly()$$
);
```
`shared_preload_libraries` requires a Postgres restart to take effect — set it in `postgresql.conf` (or via the image's documented mechanism) before bringing the container up.
### Option C — Backfill history first, then schedule
If you're upgrading from `v0.3.4 → v0.3.5+` and already have `task_usage` rows (or you just want the dashboard to show historical data on a fresh install that you've been running for a while), run the bundled backfill command once before scheduling the rollup:
```bash
# Backfills task_usage_hourly from all historical task_usage rows and stamps
# the rollup watermark. Idempotent — safe to re-run.
On a database with years of data this can scan tens of millions of rows; `--sleep-between-slices=2s` throttles the read pressure. Use `--months-back N` (plus `--force-partial`) if you only want the last N months. Once it finishes, set up Option A or Option B so new buckets keep flowing.
After upgrading, re-run `migrate up` (or restart the backend container — migrations run automatically on startup) to apply migration `103` cleanly.
## Stopping Services
If you installed via the install script:
@@ -175,6 +431,8 @@ docker compose -f docker-compose.selfhost.yml up -d
Pin `MULTICA_IMAGE_TAG` in `.env` to an exact version like `v0.2.4` if you want to stay on a specific release. Migrations run automatically on backend startup.
If the selected GHCR tag has not been published yet, fall back to `make selfhost-build` or `docker compose -f docker-compose.selfhost.yml -f docker-compose.selfhost.build.yml up -d --build`.
> **Upgrading from `v0.3.4` to `v0.3.5+` fails with `refusing to drop legacy daily rollups: ...`?** That's migration `103`'s fail-closed guard: it requires `task_usage_hourly` to be seeded before the legacy daily rollups are dropped. Run `backfill_task_usage_hourly` first, then re-run the upgrade. Full instructions in [Usage Dashboard Rollup → Option C](#option-c--backfill-history-first-then-schedule).
| `SMTP_TLS` | TLS mode. `implicit` (aliases `smtps`, `ssl`) forces SMTPS on connect; port `465` auto-enables it. Unset / `starttls` upgrades via STARTTLS | `starttls` |
| `SMTP_TLS_INSECURE` | Set `true` to skip TLS certificate verification (self-signed / private CA certs) | `false` |
STARTTLS is used automatically when advertised by the server. Port 465 (SMTPS / implicit TLS) is not currently supported - use ports 25 or 587 with STARTTLS.
STARTTLS is used automatically when advertised by the server. Port 465 (SMTPS / implicit TLS) is supported and auto-enables implicit TLS; set `SMTP_TLS=implicit` (aliases `smtps`, `ssl`) to force it on a non-standard port.
> **Note:** If neither Resend nor SMTP is configured, generated verification codes are printed to backend logs — copy them from there to log in. A fixed local testing code (e.g. `888888`) is **opt-in only**: set `MULTICA_DEV_VERIFICATION_CODE=888888` in `.env` and keep `APP_ENV` non-production. The Docker self-host stack pins `APP_ENV=production`, so the shortcut is ignored there. **Never enable a fixed code on a publicly reachable instance.**
@@ -67,8 +68,20 @@ Changes take effect after restarting the backend / compose stack. The web UI rea
| `ALLOW_SIGNUP` | Set to `false` to disable new user signups on a private instance |
| `ALLOWED_EMAIL_DOMAINS` | Optional comma-separated allowlist of email domains |
| `DISABLE_WORKSPACE_CREATION` | Set to `true` to make `POST /api/workspaces` return 403 for every caller — users can only join workspaces they were invited to |
Changes take effect after restarting the backend / compose stack. The web UI reads `ALLOW_SIGNUP` from `/api/config` at runtime, so no web rebuild is needed.
Changes take effect after restarting the backend / compose stack. The web UI reads `ALLOW_SIGNUP` and `DISABLE_WORKSPACE_CREATION` from `/api/config` at runtime, so no web rebuild is needed.
#### Locking down workspace creation
`ALLOW_SIGNUP=false` blocks new accounts from being created, but it does **not** block an already-signed-in user from creating another workspace via `POST /api/workspaces`. On a self-hosted instance where every issue/repo/agent must be visible to the platform admin, set `DISABLE_WORKSPACE_CREATION=true` to close that gap. The recommended bootstrap sequence is:
1. Start the instance with `DISABLE_WORKSPACE_CREATION=false` (the default).
2. Sign in as the admin and create the shared workspace.
3. Set `DISABLE_WORKSPACE_CREATION=true` and restart the backend. Optionally set `ALLOW_SIGNUP=false` at the same time if you also want to block new account creation.
4. Going forward, additional users join via invitation only — the "Create workspace" affordance is hidden in the UI and any direct API call returns 403.
> Note: setting `ALLOW_SIGNUP=false` blocks **all** new account creation, including users who already have a pending invitation. If you need invited users to be able to sign up but not create their own workspaces, keep `ALLOW_SIGNUP=true` (optionally combined with `ALLOWED_EMAIL_DOMAINS` / `ALLOWED_EMAILS`) and only flip `DISABLE_WORKSPACE_CREATION=true`.
### File Storage (Optional)
@@ -99,7 +112,7 @@ The `Secure` flag on session cookies is derived automatically from the scheme of
| `PORT` | `8080` | Backend server port |
| `METRICS_ADDR` | empty | Optional Prometheus metrics listener, for example `127.0.0.1:9090` |
| `FRONTEND_PORT` | `3000` | Frontend port |
| `CORS_ALLOWED_ORIGINS` | Value of `FRONTEND_ORIGIN` | Comma-separated list of allowed origins |
| `CORS_ALLOWED_ORIGINS` | Value of `FRONTEND_ORIGIN` | Comma-separated list of allowed origins. Governs **both** the HTTP CORS allowlist **and** the WebSocket `Origin` check. A browser origin that isn't listed here (and isn't `localhost`) has its real-time WebSocket upgrade rejected with `403`, so live updates stop working until a manual refresh. |
@@ -166,6 +179,111 @@ The Docker Compose setup runs migrations automatically. If you need to run them
cd server && go run ./cmd/migrate up
```
## Usage Dashboard Rollup
The Usage and Runtime dashboards read from `task_usage_hourly`, a derived table populated by `rollup_task_usage_hourly()`. The function is **not** scheduled out of the box on the default self-host stack: the bundled `pgvector/pgvector:pg17` image ships without `pg_cron`, and the backend does not run the rollup in-process either. Until something calls it on a schedule, raw `task_usage` rows will keep arriving while the dashboard stays at zero.
Pick one of the supported paths:
### Option A — External cron / systemd-timer
The simplest path. Schedule `SELECT rollup_task_usage_hourly()` every five minutes from any out-of-band timer (host crontab, systemd timer, sidecar container, Kubernetes CronJob). It is idempotent and watermark-driven — overlapping runs are no-ops on an internal advisory lock, and a missed tick catches up on the next run.
If you'd rather have Postgres schedule itself, swap the bundled image for one that ships both `pgvector` and `pg_cron` (e.g. `supabase/postgres`, or a custom build of `pgvector/pgvector` with `pg_cron` added). `pg_cron` requires `shared_preload_libraries=pg_cron` in `postgresql.conf`, which only takes effect on Postgres restart — set it before bringing the container up.
Then register the job once:
```sql
CREATEEXTENSIONIFNOTEXISTSpg_cron;
SELECTcron.schedule(
'rollup_task_usage_hourly',
'*/5 * * * *',
$$SELECTrollup_task_usage_hourly()$$
);
```
`pg_cron.database_name` defaults to `postgres`; if your Multica database has a different name, point `pg_cron` at it via that GUC or run `cron.schedule_in_database(...)` instead.
### Option C — Backfill historical data first
`rollup_task_usage_hourly()` only processes new buckets after it starts running. If you already have `task_usage` rows from before the rollup was scheduled — most commonly when upgrading from `v0.3.4` to `v0.3.5+`, or on a fresh install that has been collecting usage for a while — run `backfill_task_usage_hourly` once to seed historical buckets, then set up Option A or Option B for ongoing rollups.
The command walks `task_usage`'s full time range in monthly slices and calls the same idempotent primitive the cron path uses, so it's safe to re-run, to interrupt with Ctrl-C, and to run concurrently with an already-scheduled rollup. Flags:
| Flag | Description |
|---|---|
| `--sleep-between-slices` | Pause between monthly slices to throttle read pressure on busy databases (e.g. `2s`). Recommended on production DBs with years of history. |
| `--months-back N` | Only backfill the last N months. **Requires `--force-partial`** because the watermark still advances past the skipped older buckets — those are permanently abandoned. |
| `--dry-run` | Log slices that would be processed without writing anything. |
After backfill completes, the rollup-state watermark is stamped to `now() - 5 minutes`, so the first scheduled tick after backfill does not redo history.
### `v0.3.4 → v0.3.5+` upgrade order
Migration `103` adds a fail-closed guard that refuses to drop the legacy daily rollups until `task_usage_hourly` has caught up. If you run `migrate up` straight through on a database with existing `task_usage` rows, it aborts with:
task_usage latest event (...) by more than 01:00:00 — backfill is
incomplete or pg_cron is not running. Run cmd/backfill_task_usage_hourly
(and let pg_cron catch up) before re-running migrate
```
Recovery is straightforward: run `backfill_task_usage_hourly` (Option C above), then re-run `migrate up` (or restart the backend container — migrations run automatically on startup). **Fresh installs are exempt** — the guard short-circuits when `task_usage` is empty, and migrations succeed, but the dashboard will still stay at zero until you set up Option A or Option B.
## Manual Setup (Without Docker Compose)
If you prefer to build and run services manually:
@@ -219,6 +337,8 @@ multica.example.com {
}
```
> Even on a single domain, set `FRONTEND_ORIGIN` / `CORS_ALLOWED_ORIGINS` to that public origin (e.g. `https://multica.example.com`) on the backend. The backend's default origin allowlist is `localhost` only, so without this it rejects the WebSocket upgrade from the public URL with `403` and real-time updates silently stop working. See [LAN / Non-localhost Access](#lan--non-localhost-access).
**Separate-domain layout** — frontend and backend on different hostnames:
```
@@ -338,6 +458,8 @@ HTTP requests (issues, comments, uploads) work on LAN out of the box — Next.js
`NEXT_PUBLIC_WS_URL` is a build-time variable (see `Dockerfile.web`), so setting it only in `environment:` on the pre-built image has no effect — you must use the `selfhost.build.yml` override that rebuilds the image.
**Also required: allowlist the browser origin.** The two options above fix the WebSocket *upgrade proxying*, but a second, independent setting gates the connection: the backend validates the WebSocket `Origin` header against an allowlist that defaults to `localhost` only. When you open Multica from any other origin — a LAN IP **or a public domain behind a reverse proxy** — set `CORS_ALLOWED_ORIGINS` (or `FRONTEND_ORIGIN`) on the backend to that exact origin and restart, exactly as shown under [LAN / Non-localhost Access](#lan--non-localhost-access) above. Otherwise the upgrade is refused with `403`: the backend logs `websocket: request origin not allowed by Upgrader.CheckOrigin` and the browser console loops `disconnected, reconnecting in 3s`, while HTTP requests (and manual page refreshes) keep working because they are same-origin to the page. The single value covers both HTTP CORS and the WebSocket origin check.
> **Note:** If you need to hard-code a different public API / WebSocket endpoint into the web image for any other reason, use the same source-build override: `docker compose -f docker-compose.selfhost.yml -f docker-compose.selfhost.build.yml up -d --build`.
You're a frontend code-review agent. When an issue comes in, read the diff first. Focus only on:
- Styling issues (tailwind class names, box model)
- Accessibility (a11y)
Don't change code — leave suggestions in a comment.
```
空のままにすると(デフォルト)、エージェントは追加の制約なしに、基盤となる AI コーディングツールのネイティブな動作を使用します。
## モデルを選ぶ
ほとんどの AI コーディングツールはモデル選択をサポートしています(例えば Claude Code では Sonnet と Opus のどちらかを選べます)。空のままにするとツール自体のデフォルト値が使われ、明示的に 1 つを選ぶとそのモデルが実行されます。各ツールがサポートするモデルは、[AI コーディングツール比較](/providers)にまとめられています。
**価値の高いシークレットは `custom_env` に入れないでください**(本番データベースのパスワード、root レベルのトークンなど)。エージェントには**権限範囲が限定された専用の資格情報**(読み取り専用 API キー、単一スコープの PAT)を使用し、定期的にローテーションしてください。データベースのバックアップと DB 監査は、依然として意味のある露出面として残ります。
description: 에이전트를 생성하는 데 필요한 최소 필드와 모든 선택적 설정 — 시스템 지침, 환경 변수, 공개 범위, 동시 실행 제한, 보관.
---
import { Callout } from "fumadocs-ui/components/callout";
[에이전트](/agents)를 생성하는 데는 단 두 가지만 필요합니다. **이름** 하나와 **[AI 코딩 도구](/providers) 선택** 하나입니다. 나머지는 모두 선택 사항입니다 — 시스템 지침, 모델, 환경 변수, CLI 인자, 공개 범위, 동시 실행 제한 — 기본값으로도 문제없이 작동합니다. 먼저 실행해 보고 나중에 조정하세요. 모든 필드는 언제든지 변경할 수 있습니다.
## 에이전트 생성
전제 조건: 사용 중인 기기에 지원되는 [AI 코딩 도구](/providers)가 최소 하나는 설치되어 있고(Claude Code, Codex 등) [데몬](/daemon-runtimes)이 실행 중이어야 합니다. 아직 여기까지 준비되지 않았다면 [Cloud 빠른 시작](/cloud-quickstart)이나 [자체 호스팅 빠른 시작](/self-host-quickstart)부터 시작하세요.
준비가 끝나면 워크스페이스의 **에이전트** 페이지로 이동해 **+ New**를 클릭하거나 CLI를 사용하세요.
```bash
multica agent create
```
이 폼에는 필수 필드가 두 개뿐입니다. **이름**(워크스페이스 내에서 고유해야 함)과 **런타임**(= AI 코딩 도구 선택)입니다. 나머지 모든 필드는 아래에서 섹션별로 다룹니다.
## AI 코딩 도구 선택
각 런타임은 특정 AI 코딩 도구를 기반으로 합니다. Multica는 그중 12개를 지원합니다. 가장 일반적인 선택지는 다음과 같습니다.
| 도구 | 적합한 경우 |
|---|---|
| **Claude Code** | Anthropic의 공식 도구로, 가장 완성도 높은 기능 집합을 제공합니다. **첫 선택으로 가장 좋습니다** |
| **Codex** | OpenAI 제품으로, 주류 대안입니다 |
| **Cursor** | Cursor 에디터 생태계 사용자 |
| **Copilot** | GitHub 계정 권한을 활용하는 팀 |
| **Gemini** | Google 생태계 사용자 |
나머지 7개(Antigravity, Hermes, Kimi, Kiro CLI, OpenCode, Pi, OpenClaw)와 각 도구의 전체 기능 비교표(세션 재개, MCP, 스킬 주입 경로, 모델 선택)는 [AI 코딩 도구 비교](/providers)에서 다룹니다.
## 시스템 지침 작성
**시스템 지침**(`instructions`)은 모든 작업 앞에 추가되어, 에이전트가 어떤 역할을 맡고 어떤 규칙을 따라야 하는지 알려줍니다.
```text
You're a frontend code-review agent. When an issue comes in, read the diff first. Focus only on:
- Styling issues (tailwind class names, box model)
- Accessibility (a11y)
Don't change code — leave suggestions in a comment.
```
비워 두면(기본값) 에이전트는 추가 제약 없이 기반이 되는 AI 코딩 도구의 기본 동작을 사용합니다.
## 모델 선택
대부분의 AI 코딩 도구는 모델 선택을 지원합니다(예를 들어 Claude Code는 Sonnet과 Opus 중에서 고를 수 있습니다). 비워 두면 도구 자체의 기본값이 사용되고, 명시적으로 하나를 선택하면 그 모델이 실행됩니다. 각 도구가 지원하는 모델은 [AI 코딩 도구 비교](/providers)에 정리되어 있습니다.
모델 변경은 **새 작업에만 적용됩니다**. 이미 디스패치된 작업은 디스패치 시점에 고정된 모델로 계속 실행됩니다.
## 사용자 지정 환경 변수 (custom_env)
**사용자 지정 환경 변수**(`custom_env`)를 사용하면 작업 실행 시점에 추가 환경 변수를 주입할 수 있습니다. 대표적인 용도는 API 키 설정이나 업스트림 엔드포인트 전환입니다.
```
ANTHROPIC_API_KEY = sk-...
ANTHROPIC_BASE_URL = https://my-proxy.example.com
```
시스템에 핵심적인 변수는 재정의할 수 없습니다. `PATH`, `HOME`, `USER`, `SHELL`, `TERM`, `CODEX_HOME`, 그리고 `MULTICA_*`로 시작하는 모든 키는 데몬이 조용히 무시합니다(경고 로그는 남기지만 오류는 발생하지 않습니다).
<Callout type="warning">
**`custom_env`의 값은 Multica 서버 데이터베이스에 평문으로 저장됩니다.** 에이전트 list/get 응답은 더 이상 환경 변수 값을 전혀 포함하지 않으며, 불투명한 개수만 반환합니다. 실제 값을 읽으려면 워크스페이스 owner 또는 admin이 전용으로 감사되는 `GET /api/agents/{id}/env` 엔드포인트(CLI: `multica agent env get <id>`)를 호출해야 합니다. 작업을 실행 중인 에이전트는 호스트의 owner 자격 증명을 이용해 다른 에이전트의 환경 변수를 드러낼 수 없습니다. 이 엔드포인트는 에이전트 액터 세션을 거부합니다.
**가치가 높은 secret은 `custom_env`에 넣지 마세요**(운영 데이터베이스 비밀번호, root 수준 토큰 등). 에이전트에는 **권한 범위가 제한된 전용 자격 증명**(읽기 전용 API 키, 단일 스코프 PAT)을 사용하고 정기적으로 교체하세요. 데이터베이스 백업과 DB 감사는 여전히 의미 있는 노출 표면으로 남아 있습니다.
</Callout>
## 사용자 지정 CLI 인자 (custom_args)
**사용자 지정 CLI 인자**(`custom_args`)는 AI 코딩 도구의 명령줄에 하나씩 차례로 덧붙는 문자열 배열입니다.
```json
["--max-turns", "100", "--append-system-prompt", "always respond in Chinese"]
```
최종 명령은 다음과 같이 만들어집니다.
```bash
claude --model <model> --max-turns 100 --append-system-prompt "always respond in Chinese" [...]
```
인자는 셸을 거치지 않고 있는 그대로 전달되므로(주입 위험 없음), 특정 플래그가 인식되는지 여부는 AI 코딩 도구 자체에 달려 있습니다. 이 부분은 도구마다 상당한 차이가 있습니다.
<Callout type="tip">
`custom_env`와 `custom_args`에는 엄격한 상한이 없지만, 실제로는 **각각 10개 이내로 유지하세요**. 너무 많으면 명령줄이 길어지고 시작이 느려지며 유지 관리도 어려워집니다.
</Callout>
## 공개 범위
- **워크스페이스**(`workspace`) — 워크스페이스의 모든 멤버가 할당할 수 있습니다
- **비공개**(`private`) — 워크스페이스 owner, admin, 또는 에이전트 생성자만 할당할 수 있습니다
새 에이전트는 기본적으로 `private`입니다.
**비공개라고 해서 숨겨지는 것은 아닙니다** — 모든 멤버가 목록에서 비공개 에이전트의 이름과 설명을 볼 수 있으며, 다만 민감한 구성은 읽을 수 없습니다(환경 변수 값은 에이전트 list/get 응답에 절대 나타나지 않으며, MCP 구성은 owner가 아닌 사용자에게는 마스킹됩니다). 자세한 의미는 [에이전트 → 누가 에이전트를 할당할 수 있나요](/agents#who-can-assign-an-agent)를 참고하세요.
## 동시 실행 제한
**동시 실행 제한**(`max_concurrent_tasks`)은 이 에이전트가 한 번에 병렬로 실행할 수 있는 작업 수를 제어합니다. 기본값은 **6**입니다. 상한에 도달한 새 작업은 거부되지 않고 대기열에서 대기합니다.
이것은 두 단계 제한 중 "에이전트 계층"에 불과합니다. 데몬 자체가 더 넓은 상한(기본값 20)을 적용하며, 둘 중 더 빡빡한 쪽이 우선합니다. 자세한 내용은 [데몬과 런타임 → 병렬로 몇 개의 작업을 실행할 수 있나요](/daemon-runtimes#how-many-tasks-can-run-in-parallel)에 있습니다.
이 값을 변경해도 **이미 실행 중인 작업은 취소되지 않으며**, 다음에 처리될 작업부터만 적용됩니다.
## 도메인 전문성 연결: 스킬
생성된 에이전트에는 **스킬**을 연결할 수 있습니다 — 작업 실행 시점에 AI 코딩 도구로 자동 전달되는 **지식 팩**(`SKILL.md` + 보조 파일)입니다. 새 스킬을 만들거나, GitHub 또는 ClawHub에서 가져오거나, 기기에 있는 기존 스킬 디렉터리에서 스캔할 수 있습니다. [스킬](/skills)을 참고하세요.
## 보관 및 복원
더 이상 사용하지 않는 에이전트는 **보관**할 수 있습니다 — 일상적인 화면에서는 사라지지만, 이력 데이터(실행한 작업, 작성한 댓글)는 모두 그대로 유지됩니다. 언제든지 **복원**하여 다시 작업에 투입할 수 있습니다.
<Callout type="warning">
**보관은 해당 에이전트에 속한 완료되지 않은 모든 작업을 즉시 취소합니다** — 실행 중, 디스패치됨, 대기 중인 작업이 모두 `cancelled`로 표시되며 계속 진행되지 않습니다. 진행 중인 중요한 작업이 있다면 보관하기 전에 끝까지 완료되도록 두세요.
</Callout>
보관된 에이전트에는 새 작업을 할당할 수 없습니다.
## 다음 단계
- [스킬](/skills) — 에이전트에 지식 팩 연결하기
- [AI 코딩 도구 비교](/providers) — 12개 도구 전체의 기능 비교표
- [에이전트에게 이슈 할당하기](/assigning-issues) — 새로 만든 에이전트를 작업에 투입하기
@@ -21,7 +21,7 @@ The form has only two required fields: **name** (unique within the workspace) an
## Pick an AI coding tool
Each runtime is backed by a specific AI coding tool. Multica supports 11 of them. The most common choices:
Each runtime is backed by a specific AI coding tool. Multica supports 12 of them. The most common choices:
| Tool | Good for |
|---|---|
@@ -31,7 +31,7 @@ Each runtime is backed by a specific AI coding tool. Multica supports 11 of them
| **Copilot** | Teams leveraging their GitHub account entitlements |
| **Gemini** | Users in the Google ecosystem |
The other six (Hermes, Kimi, Kiro CLI, OpenCode, Pi, OpenClaw), along with each tool's full capability matrix (session resume, MCP, skill injection path, model selection), are covered in [AI coding tools comparison](/providers).
The other seven (Antigravity, Hermes, Kimi, Kiro CLI, OpenCode, Pi, OpenClaw), along with each tool's full capability matrix (session resume, MCP, skill injection path, model selection), are covered in [AI coding tools comparison](/providers).
System-critical variables cannot be overridden: `PATH`, `HOME`, `USER`, `SHELL`, `TERM`, `CODEX_HOME`, and any key starting with `MULTICA_*` are silently ignored by the daemon (with a warn log — no error).
<Callout type="warning">
**Values in `custom_env` are stored in plaintext in Multica's server database.** Non-creators and non-workspace-admins can't see the values (the API returns `****`), but they're still visible in database backups and DB audits.
**Values in `custom_env` are stored in plaintext in Multica's server database.** Agent list/get responses no longer carry env values at all — only an opaque count. Reading values requires a workspace owner or admin to hit the dedicated, audited `GET /api/agents/{id}/env` endpoint (CLI: `multica agent env get <id>`). Agents running tasks can NOT use their host's owner credentials to reveal env on other agents — the endpoint denies agent-actor sessions.
**Don't put high-value secrets in `custom_env`** (production database passwords, root-level tokens, etc.). Use **dedicated, limited-scope credentials** for agents (read-only API keys, single-scope PATs), and rotate them regularly.
**Don't put high-value secrets in `custom_env`** (production database passwords, root-level tokens, etc.). Use **dedicated, limited-scope credentials** for agents (read-only API keys, single-scope PATs), and rotate them regularly. Database backups and DB audits remain a meaningful exposure surface.
</Callout>
## Custom CLI arguments (custom_args)
@@ -96,7 +96,7 @@ Arguments are passed as-is, not through a shell (no injection risk), but whether
New agents default to `private`.
**Private does not mean hidden** — every member sees a private agent's name and description in the list, they just can't see sensitive config fields (the values in `custom_env` and MCP config are masked). Full meaning in [Agents → Who can assign an agent](/agents#who-can-assign-an-agent).
**Private does not mean hidden** — every member sees a private agent's name and description in the list, they just can't read sensitive config (env values never appear in agent list/get responses; MCP config is masked for non-owners). Full meaning in [Agents → Who can assign an agent](/agents#who-can-assign-an-agent).
## Concurrency limit
@@ -123,5 +123,5 @@ Archived agents can't be assigned new tasks.
## Next steps
- [Skills](/skills) — attach knowledge packs to an agent
- [AI coding tools comparison](/providers) — full capability matrix across all 11 tools
- [AI coding tools comparison](/providers) — full capability matrix across all 12 tools
- [Assigning issues to agents](/assigning-issues) — put your new agent to work
description: "에이전트는 Multica 워크스페이스의 일급 멤버입니다 — 이슈를 할당받고, 댓글을 달고, @로 멘션될 수 있습니다. 사람과의 핵심 차이는, 에이전트는 스스로 작업을 시작하며 알림을 받지 않는다는 점입니다."
---
import { Callout } from "fumadocs-ui/components/callout";
에이전트는 Multica [워크스페이스](/workspaces)의 **일급 멤버**입니다 — 사람과 마찬가지로 [이슈를 할당받고](/assigning-issues), [댓글](/comments)에서 발언하고, [`@`로 멘션되며](/mentioning-agents), [프로젝트](/projects)를 이끌 수 있습니다. 핵심 차이는 이것입니다. 모든 에이전트 뒤에는 여러분의 기기에서 실행되는 [AI 코딩 도구](/providers)가 있습니다. 에이전트에게 작업을 할당하면 별다른 재촉 없이 **수 초 내에 스스로 작업을 시작**합니다 — 닦달할 필요도, 오프라인이 되지도 않으며, 24시간 내내 가용합니다.
## 에이전트가 할 수 있는 일
에이전트는 사람과 동일한 "멤버" 표면을 사용하며, UI에서는 거의 구분되지 않습니다.
- **[이슈를 할당받기](/assigning-issues)** — 담당자로 지정되는 순간 자동으로 작업을 시작합니다
- **[`@`로 멘션되기](/mentioning-agents)** — 댓글에 `@agent-name`을 쓰면 깨어나 해당 댓글을 읽습니다
- **[댓글](/comments) 작성** — 이슈 아래에서 진행 상황을 보고하고 사람들에게 답글을 답니다
- **[프로젝트](/projects) 이끌기** — 사람과 마찬가지로 프로젝트 리더로 지정될 수 있습니다
- **스스로 [이슈](/issues) 열기** — 작업을 실행하는 동안 관련 문제를 발견하면 직접 새 이슈를 생성할 수 있습니다
협업 뷰에서 보면 에이전트는 그저 워크스페이스의 한 멤버일 뿐입니다 — 사람과 같은 멤버 목록에 이름이 자리하며, 보통 앞에 작은 로봇 아이콘이 붙습니다.
## 사람과 다른 점
몇 가지 핵심 차이는 실제로 에이전트를 사용하기 시작해야 비로소 드러납니다.
- **스스로 시작합니다** — 이슈를 할당하거나 `@`로 멘션하면 Multica가 즉시 해당 작업을 에이전트의 런타임에 디스패치합니다. 사람처럼 메시지를 보고 응답할 때까지 기다리지 않습니다. 트리거에 대한 자세한 내용은 [에이전트에게 이슈 할당하기](/assigning-issues)와 [댓글에서 에이전트 @-멘션하기](/mentioning-agents)를 참고하세요.
- **알림을 받지 않습니다** — 에이전트는 여러분의 [인박스](/inbox) 건너편에 결코 나타나지 않으며, `@all`의 수신 대상에도 포함되지 않습니다. 에이전트는 "메시지를 읽는 수신자"가 아니라 "작업을 실행하도록 트리거되는 작업 단위"입니다.
- **하나의 AI 코딩 도구에 묶여 있습니다** — 모든 에이전트는 런타임에 묶여 있습니다(런타임 = 데몬 × 하나의 AI 코딩 도구. [데몬과 런타임](/daemon-runtimes) 참고). 도구가 오프라인이면 에이전트는 작업할 수 없으며, 새 작업은 런타임이 돌아올 때까지 대기합니다.
- **보관할 수 있습니다** — 더 이상 사용하지 않는 에이전트를 보관하면 일상적인 뷰에서 사라지며, 원하면 언제든지 복원할 수 있습니다. 보관하면 현재 실행 중인 작업은 모두 취소됩니다.
## 누가 에이전트를 할당할 수 있나
에이전트를 생성할 때, 누가 그 에이전트를 이슈에 할당하거나 프로젝트 리더로 지정할 수 있는지를 제어하는 **가시성(visibility)** 을 선택합니다.
- **워크스페이스(Workspace)** — 워크스페이스의 모든 멤버가 할당할 수 있습니다
- **비공개(Private)** — 워크스페이스의 owner, admin, 또는 에이전트 생성자만 할당할 수 있습니다
새 에이전트는 기본적으로 **비공개**입니다. 전체 워크스페이스에서 사용할 수 있게 하려면, 생성 시 가시성을 `workspace`로 설정하거나 이후에 에이전트 설정에서 변경하세요. 전체 역할-권한 매트릭스는 [멤버와 역할](/members-roles)을 참고하세요.
<Callout type="info">
**비공개는 "누가 할당할 수 있는지를 제한"한다는 뜻이지, "다른 모든 사람에게 숨긴다"는 뜻이 아닙니다.** 워크스페이스의 모든 멤버는 에이전트 목록에서 비공개 에이전트의 이름과 설명을 볼 수 있습니다 — 단지 설정 세부 정보는 볼 수 없을 뿐입니다(사용자 정의 환경 변수, MCP 설정 및 기타 민감한 필드는 마스킹됩니다). "단 한 사람에게만 보이게" 하고 싶다면, 현재로서는 불가능합니다.
</Callout>
## 다음 단계
- [에이전트 생성 및 구성](/agents-create) — 에이전트를 만드는 방법
- [스킬](/skills) — 에이전트에 지식 팩 연결하기
- [스쿼드](/squads) — 적합한 에이전트가 적합한 이슈를 맡도록 리더 아래 에이전트를 그룹으로 묶기
- [데몬과 런타임](/daemon-runtimes) — 에이전트가 실제로 실행되기 위해 필요한 것
description: 이슈를 에이전트에게 넘기면 작업이 끝날 때까지 공식 담당자로 인계받습니다 — 전체 컨텍스트를 갖고 이슈 상태와 필드를 변경할 수 있습니다.
---
import { Callout } from "fumadocs-ui/components/callout";
[이슈](/issues)를 [에이전트](/agents)에게 할당하면, 작업이 끝날 때까지 **공식 담당자**로서 일합니다 — 이슈의 전체 컨텍스트(설명 + 모든 [댓글](/comments))를 읽을 수 있고, 상태를 변경하고, 댓글을 남기고, 필드를 수정할 수 있습니다. 이것은 Multica의 네 가지 트리거 경로 중 **가장 일반적이고 가장 무거운** 방식입니다. 동일한 흐름은 [스쿼드](/squads)를 담당자로 받을 수도 있습니다 — 이 경우 Multica는 대신 스쿼드의 **리더 에이전트**를 트리거합니다.
| 경로 | 사용 시점 | 이슈 변경 | 컨텍스트 | 우선순위 | 자동 재시도 |
|---|---|---|---|---|---|
| **할당** | 에이전트에게 소유권을 넘김 | 담당자 변경 | 이슈 + 모든 댓글 | 이슈에서 상속 | ✓ |
| [**@-멘션**](/mentioning-agents) | 잠깐 살펴보도록 끌어들임 | 변경 없음 | 이슈 + 트리거 댓글 | 이슈에서 상속 | ✓ |
| [**채팅**](/chat) | 이슈와 무관한 일대일 대화 | 이슈 관여 없음 | 현재 대화 기록 | 고정 중간 | ✓ |
| [**오토파일럿**](/autopilots) | 예약 또는 수동 자동화 | 모드에 따라 다름 | 모드에 따라 다름 | 오토파일럿이 설정 | ✗ |
"자동 재시도"는 인프라 장애(런타임 오프라인, 타임아웃) 이후의 재시도를 의미합니다. 에이전트 쪽의 비즈니스 오류(예: 모델이 오류를 보고하는 경우)는 재시도되지 않습니다. 자세한 내용은 [**작업**](/tasks)을 참고하세요.
## UI에서 할당하기
이슈 상세 페이지에서 **담당자** 선택기를 클릭하세요. 워크스페이스의 모든 멤버, 보관되지 않은 모든 에이전트, 보관되지 않은 모든 [스쿼드](/squads)가 목록에 표시됩니다. 에이전트(또는 스쿼드)를 선택하면 이슈가 즉시 할당됩니다.
몇 가지 규칙이 있습니다.
- **워크스페이스 에이전트**는 어떤 멤버든 할당할 수 있습니다. **프라이빗 에이전트**는 owner 또는 워크스페이스 admin만 할당할 수 있습니다.
- **온라인 런타임이 있는** 에이전트에게만 할당할 수 있습니다 — 아무도 실행하고 있지 않은 에이전트는 선택기에서 사용 불가로 표시됩니다.
- 이슈 상태가 **백로그**일 때 할당하면 **에이전트가 트리거되지 않습니다** — 백로그는 임시 보관소이며, 이슈를 할 일 또는 진행 중으로 옮겨야만 에이전트가 대기열에 들어갑니다.
`--to`는 멤버 사용자 이름 또는 에이전트 이름(퍼지 매칭)을 받습니다. 이름이 겹칠 때 — 예를 들어 에이전트 `J` 옆에 `Cursor - J`가 있을 때 — 대신 `--to-id <uuid>`를 전달하세요. 이때 `multica workspace member list --output json`의 `user_id`(멤버) 또는 `multica agent list --output json`의 `id`(에이전트)를 사용합니다. UUID 매칭은 엄격하고 모호하지 않으므로, 스크립트나 CLI를 구동하는 에이전트에게 적합합니다. `--to`와 `--to-id`는 함께 쓸 수 없습니다.
할당 해제:
```bash
multica issue assign MUL-42 --unassign
```
## 할당 이후에 일어나는 일
백로그가 아닌 이슈가 에이전트에게 할당되면, Multica는 즉시 백그라운드에서 다음을 수행합니다.
1. 이슈에서 상속한 우선순위로 `queued` 상태의 `task`를 대기열에 넣고, 에이전트가 있는 런타임으로 라우팅합니다.
2. 에이전트의 데몬이 다음 폴링 시 `task`를 가져가 `dispatched`로 전환합니다.
3. 에이전트가 작업을 시작하면 `task`가 `running`으로 이동합니다. 완료되면 `completed` 또는 `failed`가 됩니다.
4. 실행 중에 에이전트는 이슈의 상태를 변경하고, 댓글을 남기고, 필드를 수정할 수 있습니다 — 이러한 동작은 에이전트의 신원으로 표시됩니다.
**에이전트가 오프라인인 경우**, `task`는 대기열에서 기다립니다 — **5분 후 `runtime_offline` 사유로 타임아웃되어 실패합니다**. 재시도 가능한 소스(할당, @-멘션, 채팅)에 대해서는 Multica가 자동으로 다시 대기열에 넣습니다. 전체 재시도 규칙은 [**작업**](/tasks)을 참고하세요.
할당하면 에이전트가 이슈에 자동으로 구독됩니다 — 다만 Multica에서는 **에이전트가 인박스 알림을 받지 않습니다**(멤버만 받습니다). 이 구독은 내부 기록 관리일 뿐이며 사용자에게 보이는 부작용은 없습니다.
## 재할당 또는 할당 해제
담당자를 에이전트 A에서 에이전트 B로 변경하면:
1. **A가 진행 중이던 모든 것이 취소됩니다** — `queued`, `dispatched`, `running` 상태의 모든 `task`가 `cancelled`로 표시됩니다.
2. **B에게 즉시 새 `task`가 대기열에 들어갑니다**(이슈가 백로그가 아니고 B에게 온라인 런타임이 있는 경우).
<Callout type="warning">
**재할당은 이 이슈의 모든 활성 `task`를 취소합니다 — 이전 담당자의 것만이 아닙니다.** 다른 에이전트가 @-멘션 때문에 이 이슈에서 작업 중이라면, 그 `task`도 함께 취소됩니다. 현재로서는 단일 에이전트의 `task`만 따로 취소하는 UI 동작이 없습니다.
</Callout>
할당 해제(`--unassign` 또는 선택기에서 "none" 선택)는 모든 활성 `task` 항목을 `cancelled`로 표시하며 **새 항목을 대기열에 넣지 않습니다**. 기존 구독은 자동으로 정리되지 않습니다 — 이전 담당자는 구독 목록에 남아 있습니다(다만 여전히 인박스 알림은 받지 않습니다).
## 이슈당 에이전트당 활성 `task`가 하나뿐인 이유
**단일 에이전트는 같은 이슈에서 어느 시점에든 최대 하나의 `queued` 또는 `dispatched` `task`만 가질 수 있습니다.** 데이터베이스 수준의 고유 인덱스와 클레임 로직이 이를 강제합니다 — 중복 대기열 등록과 동시 실행이 서로를 덮어쓰는 것을 방지합니다.
하지만 **서로 다른 에이전트는 같은 이슈에서 병렬로 작업할 수 있습니다** — 예를 들어 에이전트 A가 담당자이고 에이전트 B가 @-멘션된 경우, 두 `task` 항목이 각자의 런타임에서 실행되며 공존할 수 있습니다. 전체 직렬/동시 실행 규칙은 [**작업**](/tasks)을 참고하세요.
## 다음 단계
- [**댓글에서 에이전트를 @-멘션하기**](/mentioning-agents) — 담당자와 상태를 건드리지 않는 더 가벼운 트리거
description: 이메일 + 인증 코드 로그인, Google OAuth, 회원가입 허용 목록, 로컬 테스트 코드를 구성합니다.
---
import { Callout } from "fumadocs-ui/components/callout";
import { Mermaid } from "@/components/mermaid";
Multica는 두 가지 로그인 방식을 지원합니다. **이메일 + 인증 코드**(기본값)와 **Google OAuth**(선택). 로그인에 성공하면 서버가 30일 수명의 JWT 쿠키를 발급합니다. 이 페이지에서는 각 방식을 구성하는 방법, 누가 회원가입할 수 있는지 제한하는 방법, 그리고 자체 호스팅 배포에서 가장 빠지기 쉬운 함정 하나를 다룹니다.
아래에서 참조하는 환경 변수 목록은 [환경 변수](/environment-variables)를 참고하세요. 토큰 사용법과 수명 주기 세부 사항은 [인증 및 토큰](/auth-tokens)을 참고하세요.
## 이메일 + 인증 코드 로그인의 작동 방식
사용자가 로그인 페이지에서 이메일을 입력합니다 → 서버가 6자리 코드를 보냅니다 → 사용자가 코드를 입력합니다 → 서버가 코드를 검증합니다 → JWT 쿠키가 발급됩니다. 표준 흐름입니다. 두 가지 전송 백엔드가 지원되므로 배포 환경에 맞는 쪽을 선택하세요.
### 옵션 A: Resend (클라우드 / 공용 인터넷 배포에 권장)
1. [Resend](https://resend.com/) 계정을 만들고 도메인을 인증합니다
2. API 키를 생성합니다
3. 환경 변수를 설정합니다:
```bash
RESEND_API_KEY=re_xxxxxxxxxxxxxxxx
RESEND_FROM_EMAIL=noreply@yourdomain.com # must be a domain verified in Resend
```
4. 서버를 재시작합니다
### 옵션 B: SMTP relay (자체 호스팅 / 온프레미스 배포용)
배포 환경에서 `api.resend.com`에 접근할 수 없거나 이미 내부 메일 relay(Microsoft Exchange, Postfix, 온프레미스 SendGrid 등)가 있는 경우에 사용하세요. 둘 다 설정된 경우 `SMTP_HOST`가 `RESEND_API_KEY`보다 우선합니다. `SMTP_HOST`가 비어 있지 않으면 `RESEND_API_KEY`도 함께 구성되어 있더라도 서버는 항상 SMTP를 통하므로, 인증 및 초대 메일이 내부 네트워크를 벗어나는 일이 결코 없습니다.
SMTP 경로는 대부분의 온프레미스 메일 서버(특히 Microsoft Exchange의 receive connector)가 노출하는 세 가지 relay 모드를 지원합니다:
| 모드 | 포트 | 인증 | TLS |
|---|---|---|---|
| 익명 내부 relay | `25` | 없음 — IP / 서브넷으로 제출을 신뢰 | 전송 경로상 없음(내부 세그먼트 전용) |
| 암묵적 TLS (SMTPS) | `465` | — | **아직 지원하지 않음** — 포트 25 또는 587을 사용하세요 |
**포트 25의 익명 Exchange relay** — 자격 증명 없이 신뢰된 서브넷에서 오는 메일을 받아들이는 일반적인 "internal SMTP relay" / Exchange 익명 receive connector:
```bash
SMTP_HOST=exchange.internal.example.com
SMTP_PORT=25
SMTP_USERNAME=
SMTP_PASSWORD=
SMTP_TLS_INSECURE=false
RESEND_FROM_EMAIL=noreply@yourdomain.com # reused as the From: header
```
**포트 587의 인증된 제출** — 서비스 계정이 필요한 relay용. 서버가 STARTTLS 지원을 알리면 자동으로 업그레이드됩니다:
```bash
SMTP_HOST=smtp.internal.example.com
SMTP_PORT=587
SMTP_USERNAME=multica
SMTP_PASSWORD=...
SMTP_TLS_INSECURE=false # set true only for self-signed / private CA
RESEND_FROM_EMAIL=noreply@yourdomain.com
```
시작 시 서버는 선택한 제공자를 출력합니다. 예를 들어 `EmailService: SMTP relay exchange.internal.example.com:25 from=noreply@example.com`(또는 `Resend API` / `DEV mode`)와 같이 표시됩니다. 비밀번호는 절대 로그에 기록되지 않습니다. 재시작 후 SMTP 줄이 보이지 않는다면 `SMTP_HOST`가 프로세스에 도달하지 못한 것이므로, 컨테이너 환경(`docker compose -f docker-compose.selfhost.yml exec backend env | grep SMTP`)을 확인하세요.
**둘 다 설정하지 않으면**: 서버는 오류를 내지 않지만, **전송되어야 했던 모든 이메일이 서버의 stdout에만 기록됩니다**. 로컬 개발에는 편리하지만(로그에서 코드를 복사하면 됩니다), 프로덕션에서는 블랙홀이 됩니다.
## 고정 로컬 테스트 코드
<Callout type="warning">
**공용으로 접근 가능한 인스턴스에서는 고정 인증 코드를 활성화하지 마세요.**
프로덕션이 아닌 인스턴스가 기본적으로 `888888`을 받아들이던 기존 동작은 제거되었습니다. 명시적으로 구성하지 않는 한 `888888`을 입력하는 것은 다른 잘못된 코드와 동일하게 처리됩니다.
이메일 백엔드를 전혀 구성하지 않은(Resend도 SMTP도 없는) 로컬 개발에서는 서버 로그에 출력되는 생성된 코드를 사용해야 합니다. 결정적인 로컬/사설 자동화가 필요하다면 `MULTICA_DEV_VERIFICATION_CODE`를 `888888` 같은 6자리 값으로 설정하고 `APP_ENV`를 프로덕션이 아닌 값으로 유지하세요:
```bash
APP_ENV=development
MULTICA_DEV_VERIFICATION_CODE=888888
```
이 단축키는 `APP_ENV=production`일 때 무시됩니다.
</Callout>
프로덕션 배포에서는 `MULTICA_DEV_VERIFICATION_CODE`를 비워 두고 `APP_ENV=production`으로 설정해야 합니다. `make selfhost` / `docker-compose.selfhost.yml`로 배포하는 경우 `APP_ENV`는 기본적으로 `production`입니다.
## Google OAuth 구성
선택 사항입니다. 구성하지 않으면 이메일 + 인증 코드만 사용할 수 있고, 구성하면 로그인 페이지에 "Google로 로그인" 버튼이 추가됩니다.
**런타임에 적용됨**: 프런트엔드는 `/api/config`를 통해 런타임에 이 설정을 읽습니다. 변경 후 서버를 재시작하면 프런트엔드가 다시 빌드하거나 다시 배포할 필요 없이 새 값을 가져옵니다.
<Callout type="warning">
**리디렉션 URI는 Google Console과 `GOOGLE_REDIRECT_URI` 양쪽에서 정확히 일치해야 합니다** — 프로토콜(`http` vs `https`), 끝의 슬래시, 포트까지 포함합니다. 조금이라도 일치하지 않으면 Google이 전체 OAuth 흐름을 거부하며, 사용자에게 표시되는 오류는 `redirect_uri_mismatch`입니다.
</Callout>
## 누가 회원가입할 수 있는지 제한하기
세 개의 환경 변수가 우선순위에 따라 조합됩니다:
<Mermaid chart={`
graph TD
Start[New user first sign-in] --> A{Email in<br/>ALLOWED_EMAILS?}
A -- Yes --> Allow[Allow signup]
A -- No --> B{Domain in<br/>ALLOWED_EMAIL_DOMAINS?}
B -- Yes --> Allow
B -- No --> C{Any allowlist<br/>non-empty?}
C -- Yes --> Block[Reject]
C -- No --> D{ALLOW_SIGNUP<br/>= true?}
D -- Yes --> Allow
D -- No --> Block
`} />
**기존 사용자는 언제든 다시 로그인할 수 있습니다** — 회원가입 허용 목록은 **최초 회원가입**에만 적용되며, 돌아오는 사용자는 막지 않습니다.
- **`ALLOWED_EMAILS`** (최고 우선순위) — 명시적 이메일 허용 목록, 쉼표로 구분합니다. **비어 있지 않으면 목록에 있는 이메일만 회원가입할 수 있습니다.**
- **`ALLOWED_EMAIL_DOMAINS`** — 도메인 허용 목록, 쉼표로 구분합니다(예: `company.io,partner.com`).
- **`ALLOW_SIGNUP`** — 마스터 스위치, 기본값 `true`. `false`로 설정하면 회원가입이 완전히 비활성화됩니다.
<Callout type="warning">
**세 계층은 OR가 아니라 AND 의미입니다.** 흔한 잘못된 직관은 `ALLOWED_EMAIL_DOMAINS=company.io` + `ALLOW_SIGNUP=true`가 "company.io에 더해 다른 모든 사람을 허용"한다는 것입니다. 그렇지 **않습니다**. 어느 계층이든 비어 있지 않은 값이 있으면 **그에 일치하지 않는 이메일은 곧바로 거부되며**, `ALLOW_SIGNUP=true`는 그것을 무효로 만들지 못합니다.
실제로 "모두 허용"하려면 세 변수를 모두 비워 두세요(또는 `ALLOW_SIGNUP=true`를 유지하세요).
</Callout>
**일반적인 구성**:
| 목표 | 구성 |
|---|---|
| 내부 전용, `company.io` 직원만 | `ALLOWED_EMAIL_DOMAINS=company.io` |
| 내부 + 소수의 외부 협업자 | `ALLOWED_EMAIL_DOMAINS=company.io` + 협업자 주소를 `ALLOWED_EMAILS`에 추가 |
| 셀프서비스 회원가입을 완전히 비활성화, 초대 전용 | `ALLOW_SIGNUP=false` |
| 개방형 회원가입(프로덕션에는 권장하지 않음) | 셋 다 비움 |
## 회원가입을 비활성화해도 사람을 초대할 수 있나요?
**이미 Multica 계정이 있는 사람만 가능합니다.** 초대 수락은 회원가입 허용 목록을 확인하지 않습니다. 초대받은 사람이 이미 회원가입한 상태라면(예: 다른 워크스페이스에서), 초대 링크를 클릭하고 로그인하면 수락할 수 있습니다.
**하지만 한 번도 회원가입하지 않은 사람은 초대로 구제할 수 없습니다.** 수락하기 전에 먼저 로그인해야 하고, 로그인의 첫 단계(인증 코드 요청)는 회원가입 허용 목록 검사를 거칩니다. `ALLOW_SIGNUP=false`이거나 그들의 이메일이 `ALLOWED_EMAILS` / `ALLOWED_EMAIL_DOMAINS`에 없으면 **회원가입을 완료할 수 없으며**, 따라서 초대도 수락할 수 없습니다.
아직 회원가입하지 않은 외부 협업자를 초대하려면: 그들의 이메일을 `ALLOWED_EMAILS`에 임시로 추가하고, 그들이 회원가입하고 초대를 수락하기를 기다린 다음 항목을 제거하세요.
초대를 만들고 사용하는 방법은 [멤버 및 역할](/members-roles)을 참고하세요.
## 다음
- [환경 변수](/environment-variables) — 이 페이지에서 사용하는 모든 변수의 전체 정의
- [인증 및 토큰](/auth-tokens) — JWT / PAT / 데몬 토큰의 분류와 사용법
- [문제 해결](/troubleshooting) — 인증 코드 미수신, OAuth `redirect_uri_mismatch`, 회원가입 거부
Use this when the deployment can't reach `api.resend.com` or you already have an internal mail relay (Exchange, Postfix, on-prem SendGrid, etc.). `SMTP_HOST` takes priority over `RESEND_API_KEY` when both are set.
Use this when the deployment can't reach `api.resend.com` or you already have an internal mail relay (Microsoft Exchange, Postfix, on-prem SendGrid, etc.). `SMTP_HOST` takes priority over `RESEND_API_KEY` when both are set — if `SMTP_HOST` is non-empty the server always goes through SMTP, even if `RESEND_API_KEY` is also configured, so verification and invite mail never leaves the internal network.
The SMTP path supports the three relay modes most on-premise mail servers (notably Microsoft Exchange's receive connectors) expose:
| Mode | Port | Auth | TLS |
|---|---|---|---|
| Anonymous internal relay | `25` | none — submission is trusted by IP / subnet | none on the wire (internal segment only) |
| Implicit TLS (SMTPS) | `465` | optional (`SMTP_USERNAME` + `SMTP_PASSWORD`) | TLS handshake on connect — auto-enabled on port `465`, or force on a non-standard port with `SMTP_TLS=implicit` |
**Anonymous Exchange relay on port 25** — the typical "internal SMTP relay" / Exchange anonymous receive connector that accepts mail from a trusted subnet without credentials:
```bash
SMTP_HOST=smtp.internal.example.com
SMTP_PORT=587 # default 25; use 587 for STARTTLS submission
SMTP_USERNAME=multica # leave empty for unauthenticated relay
SMTP_PASSWORD=...
SMTP_TLS_INSECURE=false # set true only for self-signed / private CA
SMTP_HOST=exchange.internal.example.com
SMTP_PORT=25
SMTP_USERNAME=
SMTP_PASSWORD=
SMTP_TLS_INSECURE=false
RESEND_FROM_EMAIL=noreply@yourdomain.com # reused as the From: header
```
STARTTLS is upgraded automatically when the server advertises it. Port 465 (SMTPS / implicit TLS) is **not** currently supported — use port 25 or 587.
**Authenticated submission on port 587** — for relays that require a service account; STARTTLS is upgraded automatically when the server advertises it:
```bash
SMTP_HOST=smtp.internal.example.com
SMTP_PORT=587
SMTP_USERNAME=multica
SMTP_PASSWORD=...
SMTP_TLS_INSECURE=false # set true only for self-signed / private CA
RESEND_FROM_EMAIL=noreply@yourdomain.com
```
**Implicit TLS (SMTPS) on port 465** — for providers that only offer SMTPS and don't advertise STARTTLS (e.g. Aliyun / Tencent enterprise mail). Port `465` auto-enables implicit TLS; `SMTP_TLS=implicit` (aliases: `smtps`, `ssl`) forces it on a non-standard SMTPS port:
```bash
SMTP_HOST=smtp.qiye.aliyun.com
SMTP_PORT=465 # implicit TLS auto-enabled on 465
SMTP_USERNAME=multica@yourdomain.com
SMTP_PASSWORD=...
SMTP_TLS=implicit # optional on 465; required on a non-standard SMTPS port
RESEND_FROM_EMAIL=noreply@yourdomain.com
```
At startup the server prints which provider it picked, including the negotiated TLS mode — for example `EmailService: SMTP relay exchange.internal.example.com:25 (starttls) from=noreply@example.com` or `… smtp.qiye.aliyun.com:465 (implicit-tls) from=…` (or `Resend API` / `DEV mode`). The password is never logged. If you don't see the SMTP line after restart, `SMTP_HOST` didn't reach the process — check the container env (`docker compose -f docker-compose.selfhost.yml exec backend env | grep SMTP`).
**What happens if you set neither**: the server doesn't error, but **every email that should have been sent is written to the server's stdout only**. Handy for local development (copy the code from the logs); in production it's a black hole.
**どちらも `/api/daemon/*` にアクセスできますが、スコープが異なります。** PAT は**ユーザー全体**を表し、一度認証されると、あなたが所属するすべてのワークスペースを見ることができます。デーモントークンは作成時点で単一のワークスペースに固定され、そのワークスペースのリソースにしかアクセスできません。本番環境では、デーモンはデーモントークンで実行してください。手軽さのために PAT を使う近道を選ばないでください。そうしないと、デーモンに必要な以上にはるかに大きな権限を与えてしまいます。
**セルフホストの運用者は注意してください**: 公開デプロイでは `MULTICA_DEV_VERIFICATION_CODE` を空のままにしておいてください。固定のローカルテストコードを有効にすると、`APP_ENV` が production 以外の間は、コードをリクエストできる人なら誰でもその値でサインインできてしまいます。[セルフホスト認証の構成](/auth-setup)を参照してください。
</Callout>
### Google OAuth
**Sign in with Google** をクリックして、標準の OAuth コールバックを通過してください。セルフホストには `GOOGLE_CLIENT_ID`、`GOOGLE_CLIENT_SECRET`、そしてリダイレクト URI を構成する必要があります — [セルフホスト認証の構成](/auth-setup)を参照してください。
## PAT の作成、表示、失効
PAT の**作成**は 2 つの方法で行えます。
- **Web UI**: 設定 → 個人アクセストークン → 新しいトークン
- **CLI**: `multica login` は、まだローカル PAT がない場合に自動的に 1 つ作成します
<Callout type="warning">
**完全な PAT は作成時に正確に 1 回だけ表示されます。** 更新したりダイアログを閉じたりした後は、二度と見ることができません。
Multica はデータベースに PAT のハッシュだけを保存します — サーバーでさえ元の値を取得できません。すぐにコピーして保存してください。紛失した場合の唯一の手段は、失効させて新しく作り直すことです。
</Callout>
既存の PAT の**表示**(名前、作成時刻、最終使用時刻 — 完全なトークンは**含みません**)は、設定 → 個人アクセストークンにあります。
PAT の**失効**: 一覧で Revoke をクリックしてください。失効はすぐに反映されます — その PAT で送られる次のリクエストは 401 で拒否されます。
**PAT는 거의 모든 것에 접근할 수 있습니다** — 이는 "완전한 당신"을 나타냅니다. 데몬 토큰은 데몬에 필요한 일, 즉 작업을 가져오고 결과를 보고하는 것만 할 수 있습니다.
**둘 다 `/api/daemon/*`에 접근할 수 있지만, 범위가 다릅니다.** PAT는 **사용자 전체**를 나타내며 — 일단 인증되면 당신이 속한 모든 워크스페이스를 볼 수 있습니다. 데몬 토큰은 생성 시점에 단일 워크스페이스에 고정되며 해당 워크스페이스의 리소스에만 접근할 수 있습니다. 프로덕션에서는 데몬을 데몬 토큰으로 실행하세요. 편의를 위해 PAT를 사용하는 지름길을 택하지 마세요. 그렇지 않으면 데몬에 필요한 것보다 훨씬 많은 권한을 부여하게 됩니다.
## 로그인
### Email + 인증 코드
1. 이메일을 입력하면 서버가 6자리 코드를 보냅니다.
2. 코드를 입력하면 서버가 JWT 쿠키를 발급(브라우저)하거나 PAT로 교환(CLI)합니다.
<Callout type="warning">
**자체 호스팅 운영자는 주의하세요**: 공개 배포에서는 `MULTICA_DEV_VERIFICATION_CODE`를 비워 두세요. 고정된 로컬 테스트 코드를 활성화하면, `APP_ENV`가 production이 아닌 동안 코드를 요청할 수 있는 누구나 그 값으로 로그인할 수 있습니다. [자체 호스팅 인증 구성](/auth-setup)을 참고하세요.
</Callout>
### Google OAuth
**Sign in with Google**을 클릭하고 표준 OAuth 콜백을 거치세요. 자체 호스팅에는 `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET`, 그리고 리디렉션 URI를 구성해야 합니다 — [자체 호스팅 인증 구성](/auth-setup)을 참고하세요.
## PAT 생성, 조회, 취소
PAT **생성**은 두 가지 방법으로 할 수 있습니다:
- **Web UI**: 설정 → API 토큰 → 새 토큰
- **CLI**: `multica login`은 아직 로컬 PAT가 없으면 자동으로 하나를 생성합니다
<Callout type="warning">
**전체 PAT는 생성될 때 정확히 한 번만 표시됩니다.** 새로 고침하거나 대화상자를 닫은 후에는 다시 볼 수 없습니다.
Multica는 데이터베이스에 PAT의 해시만 저장합니다 — 서버조차 원본을 가져올 수 없습니다. 즉시 복사하여 저장하세요. 분실하면 유일한 방법은 취소하고 새로 생성하는 것입니다.
</Callout>
기존 PAT **조회**(이름, 생성 시각, 마지막 사용 시각 — 전체 토큰은 **포함하지 않음**)는 설정 → API 토큰에 있습니다.
PAT **취소**: 목록에서 Revoke를 클릭하세요. 취소는 즉시 적용됩니다 — 그 PAT로 보낸 다음 요청은 401로 거부됩니다.
description: 에이전트가 cron 스케줄, 인바운드 webhook으로 작업을 시작하거나, UI나 CLI로 한 번 수동 트리거하게 합니다.
---
import { Callout } from "fumadocs-ui/components/callout";
오토파일럿은 [에이전트](/agents)가 **스케줄에 따라 자동으로 작업을 시작**하게 합니다 — cron 표현식과 타임존을 설정하면, 여러분이 아무것도 트리거하지 않아도 Multica가 알아서 [`task`](/tasks)를 디스패치합니다. 정기 점검, 반복 보고서, 야간 정리 작업 등 "상시 지시(standing order)" 형태의 작업에 잘 맞습니다. 나머지 세 가지 트리거 경로([할당](/assigning-issues), [@-멘션](/mentioning-agents), [채팅](/chat) — 모두 여러분이 직접 시작하는 방식)와 비교했을 때, 오토파일럿의 핵심 차이는 **시간 기반**이라는 점입니다.
## 오토파일럿 구성하기
워크스페이스의 **오토파일럿** 페이지에서 새 오토파일럿을 만듭니다. 다음 항목을 설정합니다.
- **이름(Name)** — 표시 이름
- **에이전트(Agent)** — 실행을 디스패치할 대상
- **우선순위(Priority)** — 생성되는 `task`에 상속됩니다(이슈 우선순위와 동일한 의미)
- **설명 / 프롬프트(Description / prompt)** — 매 실행마다 에이전트가 받는 작업 설명
- **실행 모드(Execution mode)** — 아래 참고
- **트리거(Triggers)** — `schedule`(cron + 타임존) 또는 `webhook` 중 최소 하나
## 실행 모드 선택하기
오토파일럿에는 두 가지 실행 모드가 있습니다. **"이슈 생성" 모드부터 시작하세요.**
- **이슈 생성 모드(Create issue mode)**(`create_issue`) — 기본값이며 **권장**됩니다. 각 트리거는 먼저 워크스페이스에 이슈를 생성한 다음(제목에는 현재 단일 플레이스홀더 `{{date}}` 하나만 지원되며, 이는 `YYYY-MM-DD` 형식의 UTC 날짜로 보간됩니다. 그 외의 `{{...}}` 토큰은 생성 시점에 거부되므로, 오타가 이슈 제목에 리터럴 문자열로 조용히 들어가는 일을 막습니다), 일반 할당 흐름을 통해 그 이슈를 에이전트에게 할당합니다. 모든 작업은 수동으로 할당한 이슈와 동일한 히스토리, 댓글, 상태를 가진 채 이슈 보드에 올라갑니다.
- **실행 전용 모드(Run-only mode)**(`run_only`) — 이슈 생성을 건너뛰고 `task`를 곧바로 대기열에 넣습니다. 이 실행은 보드에 표시되지 않으며 — 오토파일럿의 실행 히스토리에서만 확인할 수 있습니다.
## 스케줄에 따라 실행하기
모든 오토파일럿에는 최소 하나의 `schedule` 트리거가 필요합니다. Cron은 **표준 5필드 형식**(분 시 일 월 요일)을 사용하며, 최소 단위는 **1분**입니다(초 단위는 없음). 타임존은 IANA 형식(예: `Asia/Shanghai`)이며, cron 표현식이 어느 타임존으로 해석될지를 결정합니다.
몇 가지 예시입니다.
- `0 9 * * 1-5`, `Asia/Shanghai` — 평일 베이징 시간 오전 9시
- `*/30 * * * *`, `UTC` — 30분마다
- `0 3 * * *`, `UTC` — 매일 UTC 오전 3시
Multica 서버는 **30초**마다 만료된 트리거를 스캔합니다 — **실제 발화 시각은 최대 30초까지 지연될 수 있으며**, 초 단위로 정확하지는 않습니다. 발화 시각 즈음에 서버가 재시작되면, 시작 시 놓친 트리거를 따라잡습니다(아무것도 유실되지 않지만 즉시 발화됩니다).
## 수동으로 한 번 트리거하기
오토파일럿을 디버깅하는 동안 cron을 기다리지 않으려면 수동으로 트리거하세요.
- UI: 오토파일럿 상세 페이지에서 "Run now" 클릭
- CLI:
```bash
multica autopilot trigger <autopilot-id>
```
수동 트리거는 `schedule` 트리거와 완전히 동일한 실행 흐름을 거치며 — 실행 레코드의 `source` 필드만 `manual`로 표시됩니다.
## Webhook으로 트리거하기
오토파일럿은 인바운드 HTTP webhook으로도 발화할 수 있습니다. 오토파일럿 상세 페이지에서 **Webhook** 트리거를 추가하면, Multica가 다음과 같은 형태의 고유 URL을 생성합니다.
`event` 필드를 제공하지 않으면, Multica는 일반적인 헤더와 본문 필드로부터 이를 추론합니다(`X-GitHub-Event` + 본문 `action`, `X-Gitlab-Event`, `X-Event-Type`, 본문의 `event`/`type`/`action`). 어느 것도 일치하지 않으면 이벤트는 `webhook.received`가 됩니다.
GitHub 같은 소스를 구성할 때는 content type을 `application/json`으로 설정하세요 — 폼 인코딩된 webhook payload는 허용되지 않습니다.
### 이벤트 필터
새 webhook 트리거는 인바운드 POST마다 발화합니다. 단일 용도 URL에는 괜찮지만, 여러 이벤트 타입을 팬아웃하는 소스(GitHub가 대표적입니다 — 단일 저장소 webhook 하나가 `push`, `pull_request`, `workflow_run`, `check_suite` 등을 전달할 수 있습니다)에는 시끄럽습니다. webhook 트리거의 **이벤트 필터(Event filters)** 섹션을 사용하면 실제로 실행을 디스패치할 이벤트를 제한할 수 있으며, 그 외의 모든 것은 `status = ignored`, `reason = event_filtered`로 전달 히스토리에 기록되고 실행이나 이슈는 생성되지 않습니다.
각 행은 하나의 규칙입니다. **이벤트 이름(event name)**과 선택적으로 쉼표로 구분한 **action** 목록으로 구성됩니다. Multica는 **어느 한** 행이라도 일치하면 webhook을 허용합니다. 섹션을 비워 두면 모든 것을 수락합니다(필터링 이전 동작).
| `workflow_run` | _(비어 있음)_ | action에 상관없이 모든 `workflow_run` 이벤트 |
| `push` | _(비어 있음)_ | 모든 `push` 이벤트 |
#### 이벤트 이름과 action의 출처
Multica는 다음 순서로 인바운드 요청에서 `event` 이름과 `action`을 도출하며 — **첫 번째로 일치하는 것이 우선합니다**.
**1. 본문 봉투(Body envelope).** 본문이 문자열 `event` 필드를 가진 JSON 객체이면, 그 값이 곧 이벤트 이름입니다. 선택적인 `eventPayload` 객체는 자신의 `action` / `state` / `conclusion` / `status` 필드에서 action 후보를 제공합니다.
**4. 기본값(Default).** 위 어느 것도 일치하지 않으면, 이벤트는 `webhook.received`이고 action 후보는 없습니다.
**action 후보, 전체 목록.** 이벤트가 결정되면, Multica는 아래의 모든 값을 가능한 action 일치 대상으로 고려합니다.
- 이벤트가 `provider.event.<action>` 형태일 때의 이벤트 이름 접미사(예: `github.workflow_run.completed` → `completed`).
- 본문 필드 `action`, `state`, `conclusion`, `status` — **JSON 문자열일 때만 해당됩니다**. 불리언(`{"action": true}`)이나 숫자는 자격이 없으므로, `event=trigger, action=true`를 기대하는 필터는 `{"trigger": true}` 본문과 절대 일치하지 않습니다. `true`는 문자열이 아니라 bool이기 때문입니다.
**흔한 함정.** `Event name: trigger` / `Actions: true` 같은 필터 행은 "본문에 `trigger: true`가 있으면 발화하라"는 뜻이 **아닙니다** — 이벤트 필터는 임의의 본문 필드가 아니라 *추론된 이벤트와 action*에 일치시킵니다. 이를 적중시키려면 `X-Event-Type`로 `trigger.true`를 보내거나(또는 위에 표시된 본문 봉투를 사용하세요). 저장된 필터 행에서 주변 공백(`" workflow_run "`)은 그대로 저장되어 절대 일치하지 않으므로 — 저장하기 전에 trim하세요.
#### 빠른 테스트
필터를 구성한 후에는 `curl`로 두 분기를 모두 확인할 수 있습니다.
```bash
# Allowed — header drives event=workflow_run, body drives action=completed
- `{"status":"skipped","run_id":"…","reason":"agent runtime is offline at dispatch time"}` — 수신 대상의 런타임이 오프라인이며, `skipped` 실행으로 기록됩니다.
- `{"status":"ignored","reason":"trigger_disabled"}` — 트리거가 비활성화되어 있습니다.
- `{"status":"ignored","reason":"autopilot_paused"}` — 오토파일럿이 일시 정지되어 있습니다.
- `{"status":"ignored","reason":"autopilot_archived"}` — 오토파일럿이 보관되어 있습니다.
2xx가 아닌 응답은 실제 실패를 다룹니다.
- `400` — 유효하지 않은 JSON, 스칼라 본문, 빈 본문.
- `404` — 알 수 없는 토큰(`{"error":"webhook not found"}`).
- `413` — payload가 256 KiB를 초과했습니다.
- `429` — 토큰별 속도 제한 초과(기본값 60 req/min).
### 자체 호스팅: 공개 URL 구성하기
서버에 `MULTICA_PUBLIC_URL`이 설정되어 있으면(예: `https://multica.example.com`), 트리거 응답에 절대 경로 `webhook_url`이 포함되고 UI에는 복사 가능한 URL이 바로 표시됩니다. 설정하지 않으면 UI는 클라이언트의 API origin으로부터 URL을 구성합니다 — 데스크톱과 동일 출처 웹에는 괜찮지만, 커스텀 자체 호스팅 리버스 프록시에는 적합하지 않습니다. Multica는 잘못 구성된 리버스 프록시가 공격자가 제어하는 호스트를 가리키는 webhook URL을 서버가 발행하도록 속이지 못하게, `Host` / `X-Forwarded-Host` 헤더로부터 공개 호스트를 도출하지 않도록 의도적으로 설계되었습니다.
## 실행 히스토리 보기
모든 트리거는 **실행 레코드(run record)**를 생성하며, 오토파일럿 상세 페이지의 "History" 탭에서 볼 수 있습니다.
**오토파일럿 실패는 자동으로 재시도되지 않으며 인박스 알림도 보내지 않습니다.** 실패는 실행 히스토리에 `failed` 항목을 남길 뿐 — 할당이나 @-멘션처럼 시스템 수준에서 다시 대기열에 넣는 일도 없고, 누구에게도 알림이 가지 않습니다. 오토파일럿이 주기적이라면 **다음 cron 발화가 새 실행을 트리거**하지만, 실패한 작업이 자동으로 다시 실행되지는 않습니다.
오토파일럿이 중요하다면 직접 모니터링을 설계하세요 — 예를 들어 에이전트가 성공 시 댓글을 남기게 하고, 댓글이 누락된 것을 알아채 실패를 잡아내는 식입니다.
</Callout>
자동 재시도가 없는 이유: 오토파일럿은 이미 주기적이므로, 시스템 수준의 재시도를 추가하면 다음 예약된 실행 위에 겹쳐 중복 실행을 만들어 냅니다. 스케줄링을 전적으로 cron에 맡기면 깔끔하게 유지됩니다.
## 아직 제공되지 않는 기능
**API 종류의 트리거는 아직 연결되어 있지 않습니다.** 트리거 스키마는 `api` 종류를 예약해 두었지만, 그것을 발화하는 인그레스 라우트가 없습니다. UI는 기존 행에 Deprecated 배지를 표시하며 복사/교체 기능을 제공하지 않습니다. 트리거별 HMAC 서명 검증, IP 허용 목록, 제공자별 이벤트 프리셋은 후속 작업으로 추적되고 있으며, v1 URL은 bearer 방식만 지원합니다.
チャットは**プロバイダーセッションの再開**を通じて複数ターンのコンテキストを維持します — エージェントは最初の返信でプロバイダーセッションを確立し(例: Claude セッション)、そのセッション ID が保存されます。次のメッセージでは、タスクのディスパッチがその ID を渡し直すため、エージェントは毎回履歴を読み直すことなく**中断したところから再開**します。
もし**1 つのターンが失敗した**場合、Multica はセッション ID を確立していた以前のタスク(そのタスクが成功したか失敗したかにかかわらず)を探し、再開を試みます — 途中で一度失敗したからといって、会話全体の記憶が失われることはありません。
description: 어떤 이슈에도 속하지 않는, 에이전트와의 일대일 대화 — 완전히 샌드박스화되어 있습니다. 에이전트는 이슈를 보거나 변경할 수 없으며, 다른 누구도 이 대화를 볼 수 없습니다.
---
import { Callout } from "fumadocs-ui/components/callout";
**채팅은 당신과 [에이전트](/agents) 사이의 일대일 대화입니다** — [이슈](/issues) 보드에서 벗어나는 것입니다. 에이전트는 어떤 이슈도 보지 못하고 어떤 이슈도 변경할 수 없으며, 대화 전체가 **완전히 비공개**입니다([워크스페이스](/workspaces) 내의 다른 누구도, admin을 포함해서, 이 대화를 볼 수 없습니다). 에이전트와 접근 방식을 논의하거나, 브레인스토밍을 하거나, 어떤 이슈에도 속하지 않는 질문을 하기에 적합합니다.
## 그냥 에이전트를 @-멘션하면 안 되나요?
[@-멘션](/mentioning-agents)은 에이전트를 이슈의 컨텍스트 **안으로 끌어들입니다** — 에이전트는 이슈 설명과 모든 과거 댓글을 읽고, 이슈를 변경할 수 있습니다. 채팅은 이를 뒤집습니다. **당신을 이슈 밖으로 끌어냅니다** — 에이전트는 이 단일 대화만 볼 수 있고, 어떤 이슈의 존재도 인지하지 못하며, 이슈를 수정할 진입점도 없습니다.
두 가지 판단 기준:
- 특정 이슈의 컨텍스트에 기반한 피드백을 원할 때 → [@-멘션](/mentioning-agents)
- 어떤 이슈와도 무관한 주제를 논의하고 싶을 때(또는 다른 누구에게도 논의를 보이고 싶지 않을 때) → 채팅
## 대화 시작하기
사이드바에서 **채팅**을 열고, 에이전트를 선택한 다음, 새 대화를 시작하세요. 인터페이스는 여느 메시징 앱과 비슷합니다. 메시지를 보내면 에이전트가 답장합니다. 각 메시지는 백그라운드에서 실행을 트리거하므로(대기열에 들어간 `task`), 답장에는 몇 초가 걸릴 수 있습니다.
## 채팅에서 에이전트가 할 수 있는 일과 할 수 없는 일
에이전트는 대화 안에서 **완전히 샌드박스화된** 모드로 실행됩니다.
**할 수 있는 일:**
- 현재 메시지에 담긴 질문에 답하기
- 구성된 [스킬](/skills)과 MCP 사용하기
- 자신의 작업 디렉터리에서 파일 읽기 및 쓰기
- 이슈 컨텍스트가 필요 없는 `multica` CLI 명령 호출하기(예: 기본 워크스페이스 정보 조회)
**할 수 없는 일:**
- **어떤 이슈도 보기** — 에이전트가 받는 프롬프트에는 이슈 ID가 없으며, `multica issue list` 같은 명령은 빈 결과를 반환합니다
- **어떤 이슈도 변경하기** — 이슈 컨텍스트가 없으면 권한 검사에 의해 API 호출이 차단됩니다
- **다른 대화 보기** — 대화는 완전히 격리되어 있습니다
- **누구도, 어떤 에이전트도 @-멘션하기** — 채팅은 다른 사람에게 알릴 경로가 없는 비공개 공간입니다
## 여러 턴에 걸친 컨텍스트가 보존되는 방식
채팅은 **제공자 세션 재개**를 통해 여러 턴에 걸친 컨텍스트를 유지합니다 — 에이전트는 첫 답장에서 제공자 세션을 설정하고(예: Claude 세션), 그 세션 ID가 저장됩니다. 다음 메시지에서는 작업 디스패치가 그 ID를 다시 전달하므로, 에이전트는 매번 기록을 다시 읽지 않고도 **중단했던 지점부터 이어서 재개**합니다.
만약 **한 턴이 실패하면**, Multica는 세션 ID를 설정했던 이전 작업(그 작업이 성공했든 실패했든)을 찾아 재개를 시도합니다 — 중간에 한 번 실패한다고 해서 대화 전체의 기억이 사라지지는 않습니다.
참고: 모든 제공자가 실제로 세션 재개를 구현하는 것은 아닙니다 — 지원 현황은 [**제공자 매트릭스**](/providers)를 참고하세요.
## 대화 보관하기
더 이상 보고 싶지 않은 대화는 보관할 수 있습니다 — 대화 목록에서 우클릭하거나 상세 페이지의 "보관" 버튼을 사용하세요. 보관 후에는:
- 대화가 활성 목록에서 사라집니다("보관됨" 보기에서 여전히 찾을 수 있습니다)
- 과거 메시지, 세션 ID, 작업 디렉터리가 모두 보존됩니다 — 아무것도 삭제되지 않습니다
<Callout type="warning">
**보관 후에는 "복원" 버튼이 없습니다.** 현재 보관된 대화를 다시 활성 상태로 되돌리는 진입점이 없습니다. 나중에 그 스레드를 계속 이어가고 싶다면 새 대화를 시작해야 합니다. 보관된 대화의 내용을 다시 보려면 "보관됨" 보기를 열고 기록을 읽어 보세요.
</Callout>
## 다음
- [**오토파일럿**](/autopilots) — 에이전트가 일정에 따라 자동으로 작업을 시작하도록 하세요
- [**에이전트에게 이슈 할당하기**](/assigning-issues) — 주제를 이슈 보드로 다시 가져오세요
description: "모든 최상위 Multica CLI 명령어를 한 페이지로 정리한 개요입니다. 전체 사용법은 `multica <command> --help`를 실행하세요."
---
import { Callout } from "fumadocs-ui/components/callout";
Multica CLI는 Web UI에서 할 수 있는 거의 모든 작업을 그대로 제공합니다([이슈](/issues) 생성, [에이전트](/agents) 할당, [데몬](/daemon-runtimes) 시작 등). 이 페이지는 모든 최상위 명령어를 한 줄 설명과 함께 나열합니다. 전체 플래그와 예제는 `multica <command> --help`를 실행하세요.
## 인증하기
CLI를 처음 사용할 때 이 명령을 실행해 **개인 액세스 토큰(personal access token, PAT)**을 발급받으세요:
```bash
multica login
```
브라우저가 자동으로 열립니다. 웹 앱에서 승인하면 CLI가 PAT(`mul_` 접두사)를 `~/.multica/config.json`에 저장합니다. 이후 모든 명령은 이 PAT로 인증됩니다.
<Callout type="tip">
CI나 headless 환경에서는 브라우저 플로우를 건너뛰세요. 웹 앱의 **설정 → API 토큰**에서 PAT를 만든 뒤 `multica login --token <mul_...>`로 직접 전달하면 됩니다.
</Callout>
토큰 유형 간의 차이는 [인증과 토큰](/auth-tokens)을 참고하세요.
## 인증과 설정
| 명령어 | 용도 |
|---|---|
| `multica login` | 로그인하고 PAT 저장 |
| `multica auth status` | 현재 로그인 상태, 사용자, 워크스페이스 표시 |
| `multica workspace list` | 접근할 수 있는 모든 워크스페이스 나열 |
| `multica workspace get <slug>` | 워크스페이스 하나의 상세 정보 표시 |
| `multica workspace member list` | 현재 워크스페이스의 멤버 나열 |
| `multica workspace update <id> --name "..." [--description "..."] [--context "..."] [--issue-prefix "..."]` | 워크스페이스 메타데이터 업데이트(admin/owner). 긴 필드는 `--description-stdin` / `--context-stdin`을 사용할 수 있습니다. |
## 이슈와 프로젝트
<Callout type="info">
`list` 계열 명령(`multica issue list`, `autopilot list`, `project list` 등)은 기본적으로 짧고 **바로 복사해 붙여 넣을 수 있는** ID를 출력합니다. 이슈는 `MUL-123` 같은 이슈 키이고, 나머지 리소스는 짧은 UUID 접두사입니다. 아래 후속 명령의 `<id>` 인자는 짧은 ID와 전체 UUID를 모두 받으므로, 일반적인 흐름은 `multica issue list` → 키 복사 → `multica issue get MUL-123`입니다. 정식 UUID가 필요할 때는 `list` 명령에 `--full-id`를 전달하세요.
</Callout>
| 명령어 | 용도 |
|---|---|
| `multica issue list` | 이슈 나열(복사해 붙여 넣을 수 있는 이슈 키 출력) |
| `multica issue get <id>` | 단일 이슈 표시(이슈 키 또는 UUID를 받음) |
| `multica issue create --title "..."` | 새 이슈 생성 |
import { Callout } from "fumadocs-ui/components/callout";
이 페이지는 Multica Cloud를 처음부터 끝까지 안내합니다 — **가입 → [CLI](/cli) 설치 → [데몬](/daemon-runtimes) 시작 → [에이전트](/agents) 생성 → 첫 [작업](/tasks) 할당**. 약 5분이 걸립니다.
전제 조건은 하나뿐입니다: 로컬에 [AI 코딩 도구](/providers)([Antigravity](/providers#antigravity), [Claude Code](/providers#claude-code), [Codex](/providers#codex), [Cursor](/providers#cursor), [Copilot](/providers#copilot), [Gemini](/providers#gemini), [Hermes](/providers#hermes), [Kimi](/providers#kimi), [Kiro CLI](/providers#kiro-cli), [OpenCode](/providers#opencode), [OpenClaw](/providers#openclaw), [Pi](/providers#pi) 중 하나)를 이미 최소 하나는 설치해 두어야 합니다. 데몬은 시작할 때 이들을 자동으로 감지하며, 하나도 없으면 시작을 거부합니다.
## 1. 계정 만들기
[multica.ai](https://multica.ai)에서 가입하세요. 이메일(6자리 인증 코드) 또는 Google로 로그인할 수 있습니다.
가입 후에는 (계정 이름으로 생성된) 기본 워크스페이스에 자동으로 배치됩니다. 나중에 이름을 변경하거나 새 워크스페이스를 만들 수 있습니다.
- **이름** — 보드와 댓글에서 이 에이전트에 표시되는 이름입니다. 원하는 이름을 고르세요
- **제공자** — 로컬에 설치한 AI 코딩 도구를 선택하세요(드롭다운에는 런타임에서 감지된 도구만 나열됩니다)
- **모델**(선택) — 해당 도구 내부의 모델 선택(제공자에 따라 정적 목록 또는 동적 발견)
- **지침**(선택) — 이 에이전트를 위한 시스템 프롬프트
생성되면 에이전트는 워크스페이스 멤버 목록에 나타나며, 사람 멤버처럼 작업을 할당할 수 있습니다.
## 6. 첫 작업 할당하기
웹 UI에서 이슈를 만들거나, CLI에서 만드세요:
```bash
multica issue create --title "Add an ASCII architecture diagram to the README"
```
방금 만든 에이전트에게 이슈를 할당하세요 — 웹 UI에서 아바타를 클릭하거나, CLI를 사용하세요:
```bash
multica issue assign MUL-1 --to my-agent-name
```
`--to`는 에이전트 또는 멤버의 **이름**을 받습니다. 부분 문자열 일치도 동작합니다 — 에이전트 이름이 `my-code-reviewer`라면 `reviewer`로 해석됩니다. 워크스페이스에 이름이 겹치는 경우, 대신 `--to-id <uuid>`(`--to`와 상호 배타적)를 전달하세요. UUID는 `multica agent list --output json` 또는 `multica workspace member list --output json`으로 조회하세요.
**다음으로 데몬에서 일어나는 일**:
1. 3초 이내에 작업을 가져갑니다(상태가 `queued`에서 `dispatched`로 바뀝니다)
2. 일치하는 AI 코딩 도구를 호출하여 작업을 시작합니다(상태가 `running`이 됩니다)
3. AI가 로컬에서 작업합니다 — 코드 디렉터리를 읽고, 명령을 실행하고, 파일을 편집할 수 있습니다
4. 완료되면 결과를 Multica로 다시 보고합니다(자동 재시도 동작 여부에 따라 상태가 `completed` 또는 `failed`가 됩니다)
웹 UI는 **실시간으로**(WebSocket을 통해) 업데이트됩니다 — 새로 고침이 필요 없습니다.
## 다음 단계
- [데몬과 런타임](/daemon-runtimes) — 데몬이 어떻게 동작하는지와 런타임의 의미
- [작업](/tasks) — 작업 생명주기와 재시도 규칙
- [AI 코딩 도구 비교](/providers) — 12개 도구 간의 기능 차이
- [데스크톱 앱](/desktop-app) — 데몬을 직접 실행하고 싶지 않은 경우
- [자체 호스팅 빠른 시작](/self-host-quickstart) — 자체 백엔드 실행하기
@@ -7,7 +7,7 @@ import { Callout } from "fumadocs-ui/components/callout";
This page walks you end-to-end through Multica Cloud — **sign up → install the [CLI](/cli) → start the [daemon](/daemon-runtimes) → create an [agent](/agents) → assign your first [task](/tasks)**. Takes about 5 minutes.
One prerequisite: you already have at least one [AI coding tool](/providers) installed locally ([Claude Code](/providers#claude-code), [Codex](/providers#codex), [Cursor](/providers#cursor), [Copilot](/providers#copilot), [Gemini](/providers#gemini), [Hermes](/providers#hermes), [Kimi](/providers#kimi), [Kiro CLI](/providers#kiro-cli), [OpenCode](/providers#opencode), [OpenClaw](/providers#openclaw), or [Pi](/providers#pi)). The daemon auto-detects them on startup and refuses to start if none are present.
One prerequisite: you already have at least one [AI coding tool](/providers) installed locally ([Antigravity](/providers#antigravity), [Claude Code](/providers#claude-code), [Codex](/providers#codex), [Cursor](/providers#cursor), [Copilot](/providers#copilot), [Gemini](/providers#gemini), [Hermes](/providers#hermes), [Kimi](/providers#kimi), [Kiro CLI](/providers#kiro-cli), [OpenCode](/providers#opencode), [OpenClaw](/providers#openclaw), or [Pi](/providers#pi)). The daemon auto-detects them on startup and refuses to start if none are present.
## 1. Create an account
@@ -114,6 +114,6 @@ The web UI updates in **real time** (via WebSocket) — no refresh needed.
- [Daemon and runtimes](/daemon-runtimes) — how the daemon operates and what runtimes mean
- [Tasks](/tasks) — task lifecycle and retry rules
- [AI coding tools compared](/providers) — capability differences across the 11 tools
- [AI coding tools compared](/providers) — capability differences across the 12 tools
- [Desktop app](/desktop-app) — if you'd rather not run the daemon yourself
- [Self-host quickstart](/self-host-quickstart) — run your own backend
description: 이슈 아래에서 협업하기 — 댓글, 답글, `@` 멘션, 반응, 그리고 댓글에서 에이전트를 트리거하기.
---
import { Callout } from "fumadocs-ui/components/callout";
모든 [이슈](/issues)에는 댓글 스레드가 있습니다. 댓글을 작성하고, 다른 사람에게 답글을 달고, [멤버](/members-roles)나 [에이전트](/agents)를 `@`로 멘션하고, 반응을 추가하세요 — 지금까지 써온 어떤 작업 관리 도구에서나 하던 것과 똑같은 동작입니다. 단 한 가지 다른 점은: **`@`로 에이전트를 멘션하면 에이전트가 작업을 시작하도록 트리거된다**는 것입니다.
## 댓글 작성하기
이슈 상세 페이지 하단의 입력란에 내용을 입력하고 **보내기**를 누르세요. 댓글이 스레드에 즉시 나타납니다. 댓글은 Markdown을 지원합니다 — 제목, 목록, 코드 블록, 링크를 모두 사용할 수 있습니다.
## 댓글에 답글 달기
어떤 댓글이든 오른쪽 상단의 **답글**을 클릭하면 그 아래에 중첩된 입력란이 열립니다. 작성한 답글은 해당 댓글의 하위 항목으로 표시되어 대화 스레드를 이룹니다. 답글에도 다시 답글을 달 수 있으며, 필요한 만큼 깊이 중첩됩니다.
이슈 목록에는 최상위 댓글 수만 표시되며, 이슈를 열어야 전체 대화 트리를 볼 수 있습니다.
## 반응
각 댓글의 오른쪽 상단에는 빠른 신호를 보내기 위한 반응 버튼이 있습니다(👍, 👀, 🎉) — 동의를 표시하려고 "+1" 댓글을 따로 달 필요가 없습니다.
## `@` 멘션
댓글에 `@`를 입력하면 선택 창이 열립니다. 멤버나 에이전트를 고르면 `@`와 대상의 slug가 함께 삽입됩니다(`@alice` 또는 `@reviewer-bot`). 멘션된 대상은 자신의 [인박스](/inbox)에서 알림을 받습니다.
**에이전트를 멘션하면 자동으로 트리거됩니다** — [댓글에서 에이전트 멘션하기](/mentioning-agents)를 참고하세요.
하나의 댓글에서 같은 사람을 여러 번 멘션해도 알림은 **하나만** 발생합니다.
### `@all`은 워크스페이스 전체에 알립니다
`@all`은 특별한 대상입니다. 워크스페이스의 모든 멤버에게 알림을 보냅니다. 사람과 에이전트 모두 `@all`을 사용할 수 있습니다 — 즉 진행 상황을 보고하는 에이전트도 `@all`을 할 수 있으므로, 에이전트의 지침에 이를 아껴서 사용하라고 일러두세요.
<Callout type="warning">
**`@all`은 신중하게 사용하세요.** 규모가 큰 워크스페이스에서는 단 한 번의 `@all`이 그만큼의 인박스 알림을 순식간에 생성합니다. 모두가 정말로 알아야 하는 일에만 사용하고 — 일상적인 업데이트에는 쓰지 마세요.
</Callout>
## 이슈 참조하기
다른 이슈를 링크하려면 `MUL-123`처럼 이슈 키를 입력하세요. Multica는 댓글에서 실제 존재하는 이슈 키를 해석하여 내부적으로 `mention://issue/<uuid>` 링크로 저장합니다. 이슈 링크는 단순한 상호 참조일 뿐입니다. 사람에게 알림을 보내지 않으며 에이전트를 트리거하지도 않습니다.
보통은 `[MUL-123](mention://issue/<uuid>)`을 직접 손으로 작성할 필요가 없습니다. 그 형식은 Multica가 키를 해석한 뒤에 사용하는 표준 내부 표현입니다.
<Callout type="info">
Markdown 강조는 CommonMark 규칙을 따릅니다. 굵은 텍스트가 문장 부호나 닫는 따옴표로 끝나고 그 뒤에 한국어 조사가 바로 이어지면, 닫는 `**`가 인식되지 않을 수 있습니다.
따옴표를 굵게 표시 범위 밖으로 옮기는 것을 권장합니다:
```markdown
"**무엇을 먼저 정해두고 시작할지**"가
```
다음 대신:
```markdown
**"무엇을 먼저 정해두고 시작할지"**가
```
</Callout>
## 댓글 편집과 삭제
댓글은 작성자만 편집하거나 삭제할 수 있습니다.
댓글을 삭제하면 그 아래의 **모든 답글도 함께 삭제됩니다**(답글의 답글 포함). 내용만 바꾸려면 삭제 대신 편집을 사용하세요.
<Callout type="warning">
**댓글을 편집하면서 `@`를 추가해도 에이전트는 트리거되지 않습니다.** 트리거는 댓글이 **생성되는** 그 순간에 발생합니다 — 나중에 편집해서 새로운 `@`를 추가하거나 대상을 바꿔도 새 알림이 발송되지 않고 에이전트가 깨어나지도 않습니다. 놓친 에이전트를 불러오려면 그 에이전트를 `@`하는 **새 댓글을 작성**하세요.
</Callout>
---
여기까지 다룬 내용은 모두 "사람의 세계"입니다 — 워크스페이스, 멤버, 이슈, 프로젝트, 댓글. Linear나 Jira를 써본 적이 있다면 지금까지의 내용은 전혀 낯설지 않을 것입니다.
하지만 Multica의 결정적인 특징은 아직 등장하지 않았습니다: **에이전트를 워크스페이스의 일급 멤버로 대우하는 것**입니다. 다음 장에서 바로 이 이야기를 다룹니다.
@@ -37,6 +37,28 @@ Mentioning the same person multiple times in one comment still produces **only o
**Use `@all` carefully.** In a larger workspace, a single `@all` generates that many inbox notifications instantly. Reserve it for things everyone genuinely needs to know — not day-to-day updates.
</Callout>
## Referencing issues
To link another issue, type its issue key, such as `MUL-123`. Multica resolves real issue keys in comments and stores them as an internal `mention://issue/<uuid>` link. Issue links are cross-references only: they do not notify people and they do not trigger agents.
You normally do not need to write `[MUL-123](mention://issue/<uuid>)` by hand. That format is the canonical internal representation after Multica has resolved the key.
<Callout type="info">
Markdown emphasis follows CommonMark rules. When bold text ends with punctuation or a closing quote and is immediately followed by a Korean particle, the closing `**` may not be recognized.
Prefer moving the quote outside the bold span:
```markdown
"**무엇을 먼저 정해두고 시작할지**"가
```
instead of:
```markdown
**"무엇을 먼저 정해두고 시작할지"**가
```
</Callout>
## Editing and deleting a comment
Only the author of a comment can edit or delete it.
import { Callout } from "fumadocs-ui/components/callout";
import { Mermaid } from "@/components/mermaid";
Multica에서 [에이전트](/agents)는 우리 서버에서 실행되지 **않습니다**. 로컬에 설치된 [AI 코딩 도구](/providers)를 호출하는 **데몬**이라는 작은 프로그램이 구동하여, 여러분 자신의 기기에서 실행됩니다. Multica 서버는 조율만 담당합니다. [이슈](/issues)를 저장하고, [작업](/tasks)을 대기열에 넣고, 알맞은 **런타임**으로 분배합니다(런타임 = 데몬 × AI 코딩 도구 하나).
이 구조가 Multica와 Linear / Jira의 가장 큰 차이점입니다. **여러분의 API 키, 툴체인, 코드 디렉터리는 모두 여러분의 기기에 남아 있으며**, Multica 서버는 그중 어느 것도 보지 못합니다. 따라서 "내 에이전트가 동작하지 않는다"는 거의 항상 로컬 문제입니다. 데몬이 실행 중이 아니거나, AI 도구가 설치되어 있지 않거나, 키가 만료되었을 수 있습니다. 먼저 로컬을 확인하세요. 안내는 [문제 해결](/troubleshooting)을 참고하세요.
## 데몬 시작하기
데몬은 Multica CLI의 일부입니다. [Multica CLI](/cli)를 설치한 뒤, 여러분의 기기에서 실행하세요.
```bash
multica daemon start
```
시작 시 데몬은 네 가지 일을 합니다.
1. 로그인할 때 저장된 인증 정보를 읽습니다
2. `PATH`에 설치된 AI 코딩 도구를 감지합니다(내장 12종: [Antigravity](/providers#antigravity), [Claude Code](/providers#claude-code), [Codex](/providers#codex), [Cursor](/providers#cursor), [Copilot](/providers#copilot), [Gemini](/providers#gemini), [Hermes](/providers#hermes), [Kimi](/providers#kimi), [Kiro CLI](/providers#kiro-cli), [OpenCode](/providers#opencode), [OpenClaw](/providers#openclaw), [Pi](/providers#pi))
3. 감지된 각 도구에 대한 런타임과 함께 자신을 서버에 등록합니다
4. **3초마다** 가져올 작업이 있는지 폴링하고, **15초마다 하트비트를 전송**합니다
**데스크톱 앱에는 데몬이 포함되어 있습니다.** [데스크톱 앱](/desktop-app)을 사용한다면 `multica daemon start`를 수동으로 실행할 필요가 없습니다. 시작할 때 데몬을 자동으로 띄웁니다. 여러분의 워크플로에 어떤 방식이 맞는지는 [데스크톱 앱](/desktop-app) 페이지를 참고하세요.
## 한 기기에 여러 런타임이 생기는 이유
런타임은 서버도 아니고 컨테이너도 아닙니다. "**데몬 × AI 코딩 도구 하나**"의 조합입니다. 예를 들어, Claude Code와 Codex가 모두 설치된 MacBook에서 데몬을 시작하고, 여러분이 두 개의 워크스페이스 멤버라고 합시다. 그러면 Multica는 4개의 런타임을 등록합니다.
<Mermaid chart={`
graph TD
D["여러분의 데몬<br/>MacBook"]
D --> R1["런타임<br/>워크스페이스 A × Claude Code"]
D --> R2["런타임<br/>워크스페이스 A × Codex"]
D --> R3["런타임<br/>워크스페이스 B × Claude Code"]
D --> R4["런타임<br/>워크스페이스 B × Codex"]
`} />
핵심 포인트:
- **하나의 데몬은 여러 런타임에 매핑될 수 있습니다.** 설치된 도구와 가입한 워크스페이스의 조합마다 하나씩 생깁니다
- **같은 데몬, 워크스페이스, 도구는 정확히 하나의 런타임을 만듭니다.** 데몬을 재시작해도 중복 레코드가 생기지 않습니다
- Multica UI의 **런타임** 페이지가 이 행들을 나열합니다
<Callout type="info">
**클라우드 런타임이 곧 제공됩니다.** 현재는 대기자 명단 단계입니다. 제공이 시작되면 로컬 데몬을 실행하지 않고도 Multica Cloud에서 직접 에이전트 작업을 실행할 수 있습니다. [다운로드 페이지](https://multica.ai/download)에서 이메일로 등록하면 알림을 받을 수 있습니다.
</Callout>
## 런타임이 오프라인으로 표시되는 시점
Multica는 하트비트로 런타임이 온라인인지 판단합니다. 세 가지 핵심 수치:
| 이벤트 | 임곗값 |
|---|---|
| 데몬 하트비트 빈도 | **15초**마다 |
| 누락으로 표시 | **45초** 동안 하트비트 없음(3회 누락) |
| 자동 삭제 | 연결된 에이전트 없이 **7일** 넘게 누락 상태 |
누락은 영구적이지 않습니다. 데몬이 다시 하트비트를 보내는 즉시 온라인으로 돌아오며, 런타임 레코드도 보존됩니다. 데몬을 재시작해도 런타임은 사라지지 않습니다.
<Callout type="warning">
**누락된 런타임에서 실행 중이던 작업은 실패로 표시됩니다**(실패 사유 `runtime_offline`). 재시도 가능한 출처(이슈, 채팅)에 대해서는 Multica가 자동으로 다시 대기열에 넣습니다. 오토파일럿이 트리거한 작업은 자동으로 재시도되지 않습니다. [작업 → 어떤 실패가 자동 재시도되는지](/tasks#which-failures-retry-automatically-which-dont)를 참고하세요.
</Callout>
## 동시에 실행할 수 있는 작업 수
Multica는 두 계층에서 동시성 제한을 적용합니다.
- **데몬 계층**: 기본 **동시 작업 20개**(환경 변수 `MULTICA_DAEMON_MAX_CONCURRENT_TASKS`로 조정 가능)
- **에이전트 계층**: 기본 **에이전트당 동시 작업 6개**(에이전트별로 설정)
둘 중 더 엄격한 쪽이 적용됩니다. 데몬이 이미 작업 20개를 실행 중이라면, 어떤 에이전트에 여유가 남아 있어도 새 작업은 대기합니다.
작업이 `dispatched`로 넘어가지 못하고 `queued`에 멈춰 있다면, 보통 이 두 제한 중 하나가 포화 상태인 것입니다.
## 데몬 충돌 후 진행 중이던 작업은 어떻게 되나
데몬이 충돌하거나 강제 종료되면, 데몬이 가져갔던 작업은 `dispatched` 또는 `running` 상태에 남습니다. 다음 시작 시 데몬은 서버에 "이 작업들은 더 이상 제 것이 아니니, 실패로 표시해 주세요"라고 알립니다. 서버는 이를 사유 `runtime_recovery`와 함께 `failed`로 전환합니다. 재시도 가능한 출처에 대해서는 작업이 자동으로 다시 대기열에 들어갑니다.
이 단계가 네트워크 문제로 실패하더라도, 백업으로 **30초마다** 서버 측 스캔이 돕니다. 45초 넘게 하트비트가 없는 런타임은 누락으로 표시되며, 그 위의 작업도 함께 회수됩니다.
## 동작하지 않는 에이전트 문제 해결
"내 에이전트가 동작하지 않는다"는 문제를 만나면, 먼저 이 세 단계 체크리스트를 진행하세요.
1. `multica daemon status`를 실행해 데몬이 실행 중이고 온라인인지 확인하세요
2. `multica daemon logs -f`를 실행해 오류가 있는지 확인하세요
3. Multica UI의 **런타임** 페이지를 열어 런타임이 "온라인"으로 표시되는지 확인하세요
例: ワークスペース A でイシューのタブを 3 つ開いた状態でワークスペース B に切り替えます。A のタブ 3 つは消え、B には B で最後に開いていたものが表示されます。再び A に切り替えると、その 3 つのタブが以前の状態そのままに戻ってきます。**タブはワークスペース間で決して漏れ出しません。**
description: Multica Desktop이 무엇인지, 웹 앱과 어떻게 다른지, 언제 쓸 만한지 알아봅니다.
---
import { Callout } from "fumadocs-ui/components/callout";
Multica Desktop은 macOS, Windows, Linux용 네이티브 데스크톱 앱입니다. 구성된 환경에 대해 웹 앱과 동일한 백엔드에 연결되며 동일한 데이터를 보여줍니다. 기본적으로 Desktop은 Multica Cloud를 사용하며, 자체 호스팅 인스턴스는 로컬 런타임 설정 파일로 구성할 수 있습니다. Desktop은 브라우저가 할 수 없는 몇 가지 기능도 추가로 제공합니다: **[워크스페이스](/workspaces)별 독립적인 탭 그룹**, **[데몬](/daemon-runtimes) 자동 시작**, **원클릭 업그레이드**입니다.
## Desktop 또는 웹 — 무엇을 선택할까
| | 웹 | Desktop |
|---|---|---|
| 접근 방식 | 브라우저에서 URL 열기 | 네이티브 앱 설치 |
| 다중 탭 | 브라우저 자체 탭 (워크스페이스 구분 없음) | **워크스페이스별 독립 탭 그룹 한 개** |
| 데몬 | `multica daemon start`를 직접 실행 | 실행 시 **자동으로 시작** |
| 업그레이드 | 새로고침하면 최신 버전 | 앱이 백그라운드에서 확인하고 다음 실행 시 설치 |
| 로그인 후 데이터 | 동일 | 동일 |
**웹을 선택**: 일회성으로 사용하거나, 다른 사람의 기기에서 작업하거나, 아무것도 설치하고 싶지 않을 때.
**Desktop을 선택**: 매일 사용하거나, 여러 워크스페이스를 동시에 다루거나, 데몬을 수동으로 관리하고 싶지 않을 때.
## 다중 탭: 워크스페이스를 전환하면 어떻게 되나
Desktop은 **참여한 모든 워크스페이스**마다 독립적인 탭 그룹을 유지합니다. 워크스페이스를 전환하면 현재 워크스페이스의 탭이 하나의 단위로 숨겨지고, 이전 워크스페이스의 탭은 떠났던 그대로 복원됩니다 — VSCode의 멀티 워크스페이스 동작이나 Slack에서 워크스페이스를 전환하는 것과 비슷합니다.
예시: 워크스페이스 A에서 이슈 탭 3개를 열어 둔 상태로 워크스페이스 B로 전환합니다. A의 탭 3개는 사라지고, B에는 B에서 마지막으로 열어 두었던 것이 표시됩니다. 다시 A로 전환하면 그 3개의 탭이 정확히 이전 상태 그대로 돌아옵니다. **탭은 워크스페이스 간에 절대 새어 나가지 않습니다.**
로그아웃하면 **모든 워크스페이스의 탭 상태가 지워지므로**, 여러 사용자가 기기를 공유하더라도 데이터가 새어 나가지 않습니다.
## Desktop의 자동 업데이트 방식
실행 시 Desktop은 GitHub Releases에서 더 최신 버전이 있는지 확인합니다. 새 버전이 발견되면:
1. 백그라운드에서 새 버전을 조용히 다운로드합니다.
2. "준비 완료 — 다음 실행 시 설치됩니다"라고 알려 줍니다.
3. 종료(또는 다음 재시작) 시, 앱이 닫히기 전에 업데이트를 설치합니다.
4. 다음 실행 시 새 버전이 실행됩니다.
이 전체 과정은 **작업 중인 내용을 방해하지 않습니다**.
<Callout type="warning">
**Windows에서는 ARM64와 x64가 별도의 업데이트 채널입니다** — 잘못된 아키텍처를 설치하면 업데이트가 감지되지 않습니다. 다운로드할 때 기기에 맞는 `.exe`를 선택하세요 (ARM 빌드에는 `arm64` 접미사가 붙어 있습니다).
</Callout>
macOS 빌드는 서명 및 공증되어 있으므로, 첫 실행 시 "확인되지 않은 개발자" 경고가 표시되지 않습니다. Linux 빌드는 `.AppImage`입니다 — 자동 업데이트는 electron-updater에 의존하는데, 일부 배포판에서는 불안정할 수 있습니다. **자동 업데이트가 작동하지 않으면, 새 버전을 수동으로 다운로드하여 기존 파일을 교체하세요.**
## 별도의 CLI와 데몬이 여전히 필요한가요?
**아닙니다.** Desktop에는 동일한 `multica` CLI 바이너리가 내장되어 있으며, 시작 시 자체 데몬 프로필을 실행합니다 (터미널에서 수동으로 실행 중인 데몬과는 격리됩니다).
이미 CLI를 설치하고 `multica daemon start`를 직접 실행했더라도, Desktop은 그 데몬을 가로채지 않습니다 — 별도의 프로필로 자체 데몬을 시작합니다. 둘은 **서로 다른 런타임**으로 등록되며, UI에서 두 개의 독립된 런타임을 볼 수 있습니다.
터미널에서 CLI 명령을 실행하고 싶다면, Desktop이 특별한 경로를 제공하지는 않습니다 — 별도로 설치한 CLI를 사용하거나, 앱의 리소스 디렉터리 안 `resources/bin/multica`에 있는 번들 사본을 실행하세요.
## 다운로드 및 설치
[Multica 다운로드 페이지](https://multica.ai/download)에서 사용하는 플랫폼의 설치 프로그램을 받으세요:
| 플랫폼 | 파일 |
|---|---|
| macOS (Intel 또는 Apple Silicon) | `.dmg` |
| Windows x64 | `.exe` (표준) |
| Windows ARM64 | `.exe` (`arm64` 접미사 포함) |
| Linux | `.AppImage` |
첫 실행 시 로그인이 필요합니다 — 웹 앱과 동일한 이메일 + 인증 코드 흐름입니다. 로그인하면 Desktop이 워크스페이스 목록을 자동으로 동기화합니다.
<Callout type="info">
**Desktop은 기본적으로 Multica Cloud를 사용하지만, 로컬 설정 파일로 자체 호스팅 인스턴스를 가리키도록 할 수 있습니다.** 앱 안에는 여전히 "자체 호스팅에 연결" 선택기가 없습니다. Desktop은 렌더러가 시작되기 전에 `~/.multica/desktop.json`을 읽습니다. 파일이 없으면 Cloud 기본값을 사용합니다.
최소 자체 호스팅 설정:
```json
{
"schemaVersion": 1,
"apiUrl": "https://api.your-domain"
}
```
`apiUrl`은 필수이며 `http` 또는 `https`를 사용해야 합니다. Desktop은 동일 오리진에서 `wsUrl`을 `/ws`로 유도하고(`https`이면 `wss`, `http`이면 `ws`), API 오리진에서 `appUrl`을 유도합니다. 배포 환경이 서로 다른 오리진을 사용한다면 명시적으로 설정하세요:
```json
{
"schemaVersion": 1,
"apiUrl": "https://api.your-domain",
"wsUrl": "wss://api.your-domain/ws",
"appUrl": "https://your-domain"
}
```
`desktop.json`이 존재하지만 유효하지 않으면, Desktop은 안전하게 동작을 차단하여 Cloud로 조용히 되돌아가는 대신 차단형 설정 오류를 표시합니다. 개발 빌드의 경우, `electron-vite dev` 중에는 여전히 `VITE_API_URL` / `VITE_WS_URL` / `VITE_APP_URL`이 우선합니다. 런타임 Desktop 자체 호스팅 구성은 [issue #1371](https://github.com/multica-ai/multica/issues/1371)에서 구현되었습니다.
</Callout>
## 다음 단계
- [Cloud Quickstart](/cloud-quickstart) — Desktop을 위한 Cloud 온보딩 흐름
- [Self-Host Quickstart](/self-host-quickstart) — 자체 백엔드를 실행하고 CLI 또는 Desktop 런타임 설정으로 연결하기
- [데몬 및 런타임](/daemon-runtimes) — 데몬의 작동 방식 (Desktop이 대신 시작해 주지만 동작은 동일합니다)
常に `/{slug}/{section}` の下に置きます — `/{slug}/issues`、`/{slug}/agents`、`/{slug}/settings`。ワークスペースのルーティングロジックを絶対に重複させず、共有コードではフレームワーク固有の link API ではなく `useNavigation().push()` を使用してください。
루트에 하이픈으로 연결된 단어 묶음은 사용자가 직접 고른 워크스페이스 slug와 충돌하며, 끝없는 예약어 slug 점검을 강요합니다. 명사(`workspaces`)를 예약해 두면 `/workspaces/*` 하위 트리 전체가 자동으로 보호됩니다.
### 워크스페이스 범위 라우트
항상 `/{slug}/{section}` 아래에 둡니다 — `/{slug}/issues`, `/{slug}/agents`, `/{slug}/settings`. 워크스페이스 라우팅 로직을 절대 중복하지 말고, 공유 코드에서는 프레임워크별 link API 대신 `useNavigation().push()`를 사용하세요.
- 불리언: `is_<state>` 또는 `<state>_at` (상태 변경에는 타임스탬프 형태 선호)
- 마이그레이션 파일: `NNN_descriptive_name.up.sql` + `.down.sql` — 항상 양방향을 모두 제공
### Go
- 표준 `gofmt` + `go vet`. 예외 없음.
- Handler 파일은 도메인을 반영: `agent.go`, `auth.go`, `runtime.go`
- 테스트: `<file>_test.go`로 같은 위치에 배치
- handler에서의 UUID 파싱은 루트 `CLAUDE.md`의 규칙을 따릅니다 — 경계 입력에는 `parseUUIDOrBadRequest`, 신뢰할 수 있는 왕복에는 `parseUUID`(panic 버전), 절대 error를 확인하지 않고 `util.ParseUUID`를 직접 사용하지 마세요.
### TypeScript
- 네트워크상의 API 응답은 `snake_case`이며, api client가 경계에서 `camelCase`로 변환합니다. TS 코드 내부에서는 **항상 camelCase**.
- 열거형: string literal union을 선호하고, 런타임 순회가 필요한 경우에만 `enum`을 사용.
- TanStack Query 키: `<feature>/queries.ts` 안의 팩토리 함수, 예: `issueKeys.detail(id)`.
### 이슈 키
모든 이슈에는 `MUL-123` 같은 사람이 읽을 수 있는 키가 있습니다: 워크스페이스 `issue_prefix`(대문자와 숫자, 보통 3자, 최대 10자) + 일련번호. 워크스페이스 admin은 Settings → General에서 접두사를 변경할 수 있는데, 변경하면 기존의 모든 이슈가 다시 번호 매겨지므로 옛 접두사가 박혀 있는 외부 참조(PR 제목, 브랜치 이름, 문서와 채팅 속 링크)는 더 이상 연결되지 않습니다.
### 코드 내 주석
영어만 사용합니다. 레포는 Go와 TypeScript 모두에 이를 강제합니다. 코드에서 중국어 주석을 발견하면 그것은 버그이니 교체하세요.
이것은 모든 번역 PR이 **반드시** 지켜야 하는 용어집입니다. 예전에는 `packages/views/locales/glossary.md`에 있었는데, 그 파일은 이제 이곳을 가리키는 stub입니다.
### 핵심 구분: 엔티티 vs 개념
Multica의 제품 명사는 두 가지 범주로 나뉩니다:
- **엔티티(Entity)** — URL, 데이터베이스 row, API 타입을 가집니다. 중국어 텍스트에서는 **소문자 영문**으로 표기하여 시각적으로 타입 이름처럼 읽히게 하고 "이것은 Multica 시스템 엔티티다"라는 신호를 줍니다.
- **개념(Concept)** — 일반 명사이며, 데이터베이스 엔티티가 아닙니다. **완전히 번역**하여 중국어 사용자가 흐르는 텍스트 속에 들쭉날쭉한 영문을 보지 않도록 합니다.
이 규칙은 `apps/docs/content/docs/*.zh.mdx`와 정렬되어 있습니다 — 이 문서들은 사실상의 중국어 보이스 표준이며 20개 이상의 페이지에서 실전 검증되었습니다.
### 엔티티 — 혼합 규칙 (`issue` / `skill` / `task`)
`issue` / `skill` / `task`는 Multica의 핵심 엔티티입니다. 스키마 컬럼, API 필드, 제품 UI 레이블이 모두 영어입니다. 중국어 텍스트에서는 **혼합 규칙**을 따르며 — 무엇을 사용할지는 단어가 어디에 나타나는지에 따라 달라집니다:
| 맥락 | 표기 | 예시 |
| --- | --- | --- |
| **UI 문자열, 상태 이름, 코드 참조** | 소문자 영문 | "排队中的 task"、"创建子 issue"、"为智能体注入 skill" |
| **문서 제목 / 섹션 헤딩** | Title-case 영문 **또는** 중국어 용어 | "Issue 与 project"、"Skills"、"执行任务" |
| **장문 문서 산문에서 엔티티가 진행 중인 주어일 때** | 중국어 용어, 첫 언급 시 괄호 안에 영문 | "**执行任务**(task)是智能体每一次工作的单位" |
| **API / DB 필드** | 항상 `task` / `issue` / `skill` | `task_id`, `issue_status`, `skill_uuid` |
중국어 용어 참고:
- `task` ↔ `执行任务` (맥락이 분명해지면 `任务`로 줄여서 사용)
- `issue`에는 정착된 중국어 번역이 없습니다 — 영문 유지; 제목에서는 `Issue`처럼 대문자로 표기 가능
- `skill`에는 정착된 중국어 번역이 없습니다 — 영문 유지; 제목에서는 `Skills`처럼 대문자로 표기 가능
**`issue` / `skill` / `task`가 `project` / `autopilot`처럼 중국어로 강제 번역되지 않는 이유**:
- **`issue` / `task`**: 개발 팀은 영어로 대화합니다. 중국어 후보들("任务" — 너무 모호하고 "工作"과 거의 동의어; "工单" — IT 티켓 어감; "议题" — GitHub 스타일이지만 제품 느낌과 맞지 않음)은 모두 `issue`보다 못하게 읽힙니다. **하지만** 장문 문서 산문에서 소문자 `task`를 50번 반복하면 리듬이 깨지므로 — 산문에서는 `执行任务`를 허용하되, UI 문자열과 상태 이름은 소문자 영문으로 유지합니다.
- **`skill`**: 정착된 중국어 용어가 없는 Multica 고유 개념입니다.
- **`project` → "项目"**: 정착된 주류 중국어 단어. Feishu / Tower / Teambition / PingCode / GitHub Projects — 모든 중국어 제품이 이를 번역합니다. 중국어 맥락에서 `project`를 그대로 두는 제품은 없습니다.
- **`autopilot` → "自动化"**: 중국어에서 "autopilot"은 Tesla의 "自动驾驶"를 연상시키며, 이 기능이 하는 일(일정에 따라 task 실행)과 맞지 않습니다. Notion과 Feishu 모두 "自动化"를 사용하며, 그것이 업계 합의입니다.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.