mirror of
https://github.com/multica-ai/multica.git
synced 2026-07-05 21:39:54 +02:00
fix/pi-empty-model-list
763 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
5eba94ee25 |
feat(lark): inbound context enrichment — post / merge_forward / quoted-reply (MUL-2951) (#3724)
Expand an inbound Lark bot message's body before dispatch with the context
a user explicitly attached, so the agent sees a semantically complete
conversation instead of a bare "@bot 总结一下".
- post: flatten rich-text (title + paragraphs, links, @-mentions) to plain
text synchronously in the decoder.
- merge_forward: inline the forwarded transcript via a single GetMessage —
GET /open-apis/im/v1/messages/{id} returns the forward sentinel plus the
bundled children. (The issue's container_id_type=merge_forward query is
undocumented; this avoids it and also handles a forwarded quoted parent.)
- quoted reply: prepend the parent_id message as a <quoted_message> block;
a parent that is itself a forward nests a <forwarded_messages> block.
- new InboundEnricher runs in the WS connector between decode and emit,
bounded by EnrichTimeout and degrading to "[unable to fetch]" placeholders
so it never blocks the ~3s long-conn ACK budget.
/issue stays parseable on a quote-reply by parsing the command from the
user's own text (CommandBody) rather than the enriched body.
Short-window debounce batching (issue item #4) is tracked as a follow-up.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
|
||
|
|
945d684afd |
MUL-2965 fix(metrics): harden business sampler quality (#3738)
* fix(metrics): harden business sampler quality (MUL-2965) Co-authored-by: multica-agent <github@multica.ai> * fix(metrics): alert on sampler acquire failures Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
b92c3fbc93 |
chore(analytics): stop shipping operational events to PostHog (MUL-2967) (#3720)
* chore(analytics): stop shipping operational events to PostHog (MUL-2967) Operational / execution-lifecycle telemetry dominated PostHog event volume and drove the bill: runtime_offline alone was ~54% of ~22.6M events/mo, and ~99% of events were billed at the higher identified-event rate. These signals already have Prometheus counters (Grafana), so the PostHog copies were redundant cost. - Add analytics.IsMetricsOnly; metrics.RecordEvent now skips the PostHog Capture for runtime_* and autopilot_run_* while still incrementing their Prometheus counter (their analytics.Event constructors are retained to feed the metric label set via IncForEvent). - Remove the agent_task_* PostHog path entirely: drop captureTaskEvent and the AgentTask* constructors/constants. Their Prometheus side is unchanged via the typed BusinessMetrics.RecordTask* methods. Also remove the now-dead taskDurationMS / willRetryTask helpers. - Update the pairing lint test (no agent_task allow-list, no naked-Capture exception), add a RecordEvent skip test + IsMetricsOnly test, and update docs/analytics.md (taxonomy, per-event banners, reconciliation). Product/funnel events (signup, onboarding, issue_created, issue_executed, chat_message_sent, agent_created, autopilot_created, etc.) are unchanged and still ship to PostHog. Co-authored-by: multica-agent <github@multica.ai> * docs(analytics): correct agent_task Prometheus metric contract (MUL-2967) Address PR review: the agent_task_* "Prometheus-only" banner claimed the old PostHog event properties (task_id, agent_id, duration_ms, error_type, will_retry, ...) were the metric label set. They are not — the real labels are only source/runtime_mode/provider/terminal_status/failure_reason. - Replace the agent_task_* sections with the actual metric names and labels (multica_agent_task_*; see business.go / labels.go), and explain that completed/failed/cancelled are terminal_status values on multica_agent_task_terminal_total, with wall-clock in the *_seconds histograms. - Tighten the runtime_*/autopilot_run_* banners so id properties aren't mistaken for labels. - Drop the stale AgentTask allow-list reference from the pairing lint test header comment. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
598a6c51f2 |
refactor(server/lark): collapse HTTP_ENABLED + WS_ENABLED into the SECRET_KEY gate (MUL-2671) (#3717)
MULTICA_LARK_HTTP_ENABLED and MULTICA_LARK_WS_ENABLED were staging knobs from the multi-PR rollout of the Lark MVP — they let the DB schema + inbound dispatcher land before the HTTP wire was real, and before the WS long-conn protocol was wired. Now that the MVP has shipped end-to-end, "I set SECRET_KEY but I don't want to talk to Lark" is not a useful production state: setting the at-rest master key is the operator's opt-in for the integration as a whole. Collapse the gate down to MULTICA_LARK_SECRET_KEY alone. When the key is present, wire the real HTTPAPIClient + the real WSLongConnConnector. CI / integration tests that want stub-style behaviour can point MULTICA_LARK_HTTP_BASE_URL at a mock server (already supported) instead of toggling a separate flag. Host overrides (HTTP_BASE_URL, REGISTRATION_DOMAIN, CALLBACK_BASE_URL) stay — those are real ops needs for international tenants / staging. stubAPIClient + NoopConnectorFactory remain exported because the test suite uses them directly; only the router boot path stops reaching for them. The connector factory keeps its noop fallback for the case where the endpoint fetcher fails to construct, so a malformed MULTICA_LARK_CALLBACK_BASE_URL degrades gracefully (visible as "connector=noop" in the boot log) instead of panicking the server. Lark integration + handler tests still pass; go vet clean. Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
d6a556bdbf |
fix(execenv): refresh skills in place on reuse instead of accumulating duplicate dirs (#3716)
Re-dispatching the same agent on the same issue reuses the persistent workdir via execenv.Reuse(), where the standard-provider skill refresh re-wrote skills without clearing the prior dispatch's output, so allocateCollisionFreeSkillDir dodged Multica's own directories into issue-review-multica-N. On reuse, reclaim the platform-owned managed skill directories the prior manifest recorded (removeReusedManagedSkillDirs) and roll back the remaining sidecar files (CleanupSidecars) before refreshing, so each skill lands at its canonical slug every dispatch. Mirrors the Codex hydrateCodexSkills wipe; scoped to reuse, which never runs for local_directory tasks. Fixes #3684 (MUL-2963). |
||
|
|
8c98940b79 |
Lark Bot integration MVP: migration + service boundary (MUL-2671) (#3277)
* feat(db): add Lark integration migration (MUL-2671) Introduces seven tables for the 飞书 Bot integration MVP — per-agent PersonalAgent installations, user/chat bindings, inbound dedup + non-content drop audit, outbound card mapping, and short-lived single-use member binding tokens. Schema notes: - chat_session schema unchanged; Lark routes through a separate binding table rather than adding a metadata JSONB column. - Outbound card mapping is task/message scoped so multiple runs on the same session can't stomp each other's cards. - lark_inbound_audit stores routing / identity / drop_reason ONLY, never message body — the audit channel for unbound users and group messages that don't address the Bot. - app_secret stores ciphertext (encryption helper lands in a follow-up commit on this branch); DB never sees plaintext. Co-authored-by: multica-agent <github@multica.ai> * feat(util): add secretbox AES-256-GCM helper for at-rest secrets First consumer is lark_installation.app_secret (MUL-2671 §4.4), but the helper is intentionally generic — future per-tenant secrets that must not appear in a DB dump can reuse it. Construction: AES-256-GCM with a per-message random nonce, providing authenticated encryption. Tampered ciphertext fails Open instead of silently decrypting to garbage. Master key loaded from a base64 env var via LoadKey; key rotation is not in scope yet. Co-authored-by: multica-agent <github@multica.ai> * refactor(issues): extract IssueService.Create as single create entry (MUL-2671) Establishes the service-layer boundary mandated by Elon's 二审 of MUL-2671 §4.8: issue creation no longer lives inside the HTTP handler. Both the HTTP POST /issues handler and the future Lark /issue command call into service.IssueService.Create, so duplicate guard, issue numbering, attachment linking, broadcast, analytics, and agent/squad enqueue stay aligned. Handler responsibilities shrink to parsing the HTTP request, doing actor resolution / validation (transport-specific), and converting service results into the IssueResponse + 201. The transaction-wrapped core, attachment link, event publish, analytics capture, and agent/squad enqueue all move into service.IssueService.Create. A BroadcastPayload callback on the service keeps the WS broadcast shape (the full IssueResponse) without forcing the service to depend on handler-layer response types. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations): add Lark package skeleton (MUL-2671) Establishes the architectural boundaries Elon's 二审 mandated as first-PR blockers without dragging in OAuth, WebSocket, or card-patching code (those land in follow-up PRs): - ChatSessionService interface — channel-aware chat-session entry point for Lark, deliberately separate from the HTTP SendChatMessage handler. The HTTP handler's single-creator guard (creator_id == request user_id) is correct for the browser client but rejects group chat_sessions by construction; Lark needs its own service. - AuditLogger interface — the only path for recording dropped events. Its signature deliberately omits message body, enforcing the drop-audit policy (MUL-2671 §4.7) at the type level: unbound users and non-addressed group messages can't accidentally end up in chat_session. - Typed IDs (OpenID, ChatID) prevent UUIDs from being conflated with Lark-side identifiers at compile time. - DropReason constants align dashboard/audit queries across callers. Co-authored-by: multica-agent <github@multica.ai> * refactor(issues): move parent/project workspace check into IssueService (MUL-2671) Parent existence and project workspace membership now live inside IssueService.Create, inside the same transaction as the duplicate guard and counter increment. The HTTP handler stops re-implementing the lookup; every future create entry (Lark /issue, MCP, API keys) inherits the same boundary without copy-pasting the SQL. Adds two error sentinels (ErrParentIssueNotFound, ErrProjectNotFound) so transports can translate to their own error shapes. Handler-level cross-workspace tests guard the boundary against future regressions. Co-authored-by: multica-agent <github@multica.ai> * fix(db): harden Lark migration safety底座 — TTL cap + workspace FK (MUL-2671) Two storage-layer hardenings that move the must-fix line off "the app layer enforces it" and onto the schema itself, so future write paths or hand-inserted rows cannot regress the invariants. 1) lark_binding_token TTL cap. The DB CHECK was 1 hour as defense-in-depth while the app constant was 15 minutes; the CHECK now matches the product cap (15 minutes). Application constant docstring updated to reflect that storage enforces the same bound. 2) lark_user_binding workspace membership. The table previously only FK'd to workspace / user / installation independently, so a binding could exist for a user no longer in the workspace, or claim a workspace different from its installation's. Two composite FKs close the gap structurally: * (installation_id, workspace_id) → lark_installation(id, workspace_id) — guarantees a binding's workspace_id always matches its installation's workspace_id. A new UNIQUE (id, workspace_id) on lark_installation is added as the FK target. * (workspace_id, multica_user_id) → member(workspace_id, user_id) with ON DELETE CASCADE — when a user is removed from the workspace, the binding cascades away in the same transaction. There is no longer a path where lark_user_binding outlives workspace membership. These two FKs are the schema-level proof for §4.3's "unbound or non-workspace members cannot leak content into chat_session" invariant. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): inbound services + /issue dispatcher (MUL-2671) Lands the inbound service layer for the Lark Bot MVP, sitting on top of the migration + service-boundary scaffold from the previous commits. What ships: - sqlc queries for all seven lark_* tables (idempotent dedup insert, CAS WS-lease, single-use binding-token consume, etc.) plus GetMostRecentUserChatMessage for the /issue fallback. - AuditLogger backed by lark_inbound_audit; signature deliberately body-free so callers cannot leak content into the drop log. - ChatSessionService: find-or-create chat_session via the binding table (winner-takes-all on the UNIQUE race), append-with-dedup, /issue parser, "previous user message" fallback for bare `/issue` invocation. - Dispatcher orchestrates the inbound pipeline in one place: installation routing → group-mention filter → identity check → ensure session → append+dedup → /issue → enqueue chat task. Group sessions use the installer as creator (stable workspace identity); p2p uses the sender. Agent-offline path falls through with OutcomeAgentOffline so the WS adapter can reply with the offline notice from §4.6. - BindingTokenService: random URL-safe token, SHA-256 stored hash, 15-min TTL pinned at the application AND the DB CHECK; Redeem returns the same opaque error for all rejection cases (no timing oracle on replay). - Unit tests for the parser (13 cases), dispatcher (8 cases via fake Queries/Chat/Audit/IssueCreator/Enqueuer), and binding-token hash/entropy. Real-DB integration tests for OAuth + token redeem land alongside the HTTP handlers in the next commit. Out of scope for this commit (next ones on the same feature branch): OAuth callback, HTTP routes, WebSocket hub, outbound card patcher, frontend. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): installation HTTP surface + secretbox-gated wiring (MUL-2671) Lands the HTTP boundary on top of the inbound services from the previous commit. What ships: - InstallationService.Upsert: the only path that writes lark_installation. Encrypts app_secret with the secretbox passed in at construction time; refuses to fall back to plaintext storage (returns an error from the constructor if no Box is supplied), so a misconfigured dev environment cannot accidentally land a row with cleartext credentials. Revoke flips status without DELETE so audit trail survives. - HTTP handlers under /api/workspaces/{id}/lark/: * GET /installations — member-visible (Integrations tab renders for non-admins). Soft 200 with empty list + configured:false when MULTICA_LARK_SECRET_KEY is unset, so the tab does not error on self-host that has not opted in. * POST /installations — admin-only; 503 when not configured. Re-validates agent_id ∈ workspace before accepting credentials so a cross-workspace agent UUID is rejected. * DELETE /installations/{id} — admin-only; workspace-scoped lookup so one workspace cannot revoke another's installation by UUID guess. - POST /api/lark/binding/redeem (user-scoped, no workspace context): the only path that mints a lark_user_binding row from user action. Redeemer identity comes from the session, not the token, so a stolen link cannot bind an open_id to an attacker's Multica user. The composite FK on lark_user_binding cascades the binding away if the user is not (or no longer) a workspace member, so a non-member who steals the link gets 403 at the DB layer. - Two new event-bus types in protocol.events: EventLarkInstallationCreated, EventLarkInstallationRevoked. - Router wiring: MULTICA_LARK_SECRET_KEY drives a conditional initialization of h.LarkInstallations + h.LarkBindingTokens. When unset, the integration disables itself with an INFO log and the rest of the server boots normally. - Handler tests cover all four not-configured short-circuits. Happy-path integration tests (real DB, full create→list→revoke cycle and token mint→redeem) ship alongside the WS hub PR. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): close binding-token rebind & typed task errors (MUL-2671) Two must-fixes from PR review on HEAD |
||
|
|
1544e3b68a |
feat(skills): built-in agent skills (WIP) MUL-2759 (#3456)
* feat(skills): introduce built-in agent skills (WIP) Inject platform-authored, version-bundled skills into every agent on top of its workspace-bound skills, so agents learn how to operate Multica correctly without users needing to know the internals or agents needing to read source. Mechanism: skills are embedded into the server binary and appended to the agent payload at task-claim time (handler/daemon.go), reusing the existing SkillData wire + daemon-side writeSkillFiles. The daemon needs no changes, and because it travels over an existing wire field, older daemons pick the skills up the moment the server ships. First skill: multica-mentioning — how to build a working @mention (look up the UUID, match type to id source, know what each mention type triggers). WIP: injection mechanism + first skill only; more skills to follow in dependency order (skill -> agent -> squad). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(skills): make multica-mentioning the standard template + add eval Add the contract-skill frontmatter the other built-in skills will copy: user-invocable:false (it triggers from context, not as a slash command) and allowed-tools fencing it to the multica CLI it teaches. These keys survive to agent machines untouched (ensureSkillFrontmatter only ever adds a missing name). Add a Go eval in builtin_skills_test.go (a _test.go so it never ships to agent machines via the skill-files walk): - Enforces the template invariants on every built-in skill, present and future: multica- prefix, name+description present, description within 1024 chars, body within the 500-line L2 budget, no eval file leaking into the shipped payload. - Couples the mentioning skill's documented contract to the real util.ParseMentions: its Incorrect examples must parse to nothing (a name where a UUID belongs fails silently) and its Correct example must fire. A drift in the mention regex now breaks CI instead of silently turning the skill into a lie agents act on. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(skills): add working-on-issues built-in skill Co-authored-by: multica-agent <github@multica.ai> * feat(skills): verify linked PRs in issue workflow skill Co-authored-by: multica-agent <github@multica.ai> * feat(skills): add skill import and discovery built-ins Co-authored-by: multica-agent <github@multica.ai> * feat(skills): add skill authoring built-in Co-authored-by: multica-agent <github@multica.ai> * docs(skills): align builtin skill workflows Co-authored-by: multica-agent <github@multica.ai> * docs(skills): use structured skill search Co-authored-by: multica-agent <github@multica.ai> * fix(skills): make built-in skill bundle launch-ready Co-authored-by: multica-agent <github@multica.ai> * fix(skills): align built-ins with additive skill binding Co-authored-by: multica-agent <github@multica.ai> * feat(skills): add creating agents built-in skill Co-authored-by: multica-agent <github@multica.ai> * Add built-in squads skill Co-authored-by: multica-agent <github@multica.ai> * refactor(skills): rewrite built-in skills as source-traced contracts Rewrite the built-in agent skills to the inbuilt-skill-authoring standard: state source-traced product facts with the source-code link logic as the core, not prescriptive how-to coaching. - creating-agents: drop the Decision-flow / Do-don't-consequences methodology; replace with field/behavior contracts (validation, persisted shape, daemon claim-time consumption, env gating, skill binding). - skill-discovery: stop teaching repo/github_stars as selection signals — searchClawHubSkills never populates them (always null); rank by install_count + source/url + description. Add file:line citations. - mentioning: drop the unbacked "member mention sends a notification" claim (no such path in the comment handler); state that only agent/squad mentions enqueue work. Tighten the parser-failure wording. - working-on-issues: refresh citations drifted by the main merge; describe the PR response `state` enum accurately; trim status coaching. - skill-importing: correct response type to SkillWithFilesResponse; document the reserved SKILL.md supporting-file rule; add line-accurate citations. - squads: correct the "leader cannot be archived" overstatement (not rejected at create/update; fails closed later at routing/dispatch); refresh source-map attributions and test list. Each skill now ships references/<skill>-source-map.md as its evidence layer (line-accurate citations live there, not pinned in the test, so a future main merge cannot rot them into stale lies). builtin_skills_test.go: replace coaching/line-number pins with drift-resistant contract anchors, forbid the coaching phrasing, and require every skill to ship its source-map. The ParseMentions behavior coupling is preserved. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * docs(skills): close field-role and citation gaps found in review Independent review of the rewritten built-in skills surfaced two real gaps and some citation drift; this fixes them. - creating-agents: add the three missing field rows (visibility, max_concurrent_tasks, mcp_config) to the field-contract table — mcp_config is runtime-consumed (TaskAgentData, daemon.go), visibility is access-control (default private), max_concurrent_tasks is a scheduler cap (default 6). Mark custom_args/runtime_config JSON validation as CLI-side (the server marshals as-is). Correct the CLI body-builder note (description/instructions use a non-empty check, the rest use Changed). Source-map: fix the env query name (UpdateAgentCustomEnv), the conformance test name, and add the new field defaults + the McpConfig runtime-payload line. - mentioning: the @squad mention private gate is canAccessPrivateAgent, not canEnqueueSquadLeader (that wrapper is the assignment/child-done path). - working-on-issues: cite notifyParentOfChildDone at its func def (:51), not the doc comment (:15). - skill-importing: config.origin is set only when the source supplied an origin — note it may be absent; cite createSkillWithFiles at its definition (skill_create.go:72), not the call site. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * Add built-in skills for autopilots runtimes and resources Co-authored-by: multica-agent <github@multica.ai> * feat(runtime): list skill descriptions in the brief Skills index The brief's `## Skills` section emitted bare skill names only, discarding the one-line description that SkillContextForEnv already carries. For Claude-family providers the frontmatter description is loaded natively; for providers without native skill discovery (hermes/default) the brief's list is the only signal they ever see, so a bare name gave them nothing to decide when to load a skill. Emit `name — description` when a description is present, falling back to the bare name when it is empty. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * refactor(skills): drop CLI-only rule from working-on-issues The "Platform data goes through the CLI" section duplicated the runtime brief's `## Important: Always Use the multica CLI` section verbatim (and the attachment-via-CLI note duplicated the brief's `## Attachments`). The CLI-only rule is universal and must be known before any skill loads, so the brief is its single source of truth; the skill copy was pure redundancy and a drift risk. Remove it and the matching intro clause. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * refactor(skills): remove discovery guidance from built-ins * docs(skills): remove stale skill-necessity records The per-skill necessity records had drifted to 3 of 8 shipped skills plus a record for `multica-skill-authoring`, which is not a shipped built-in skill. Per-skill "why it exists / when to use it" already lives co-located with each skill (frontmatter `description` + `references/<skill>-source-map.md`) where it cannot drift from the skill, and the doc's methodology duplicated the workspace's inbuilt-skill-authoring protocol. Remove the file rather than keep a parallel listing that every new skill has to remember to update. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(runtime): add source-authority escape hatch to the brief The brief already tells agents to run `--help` for command discovery, but nothing stated the trust precedence when a skill, the brief, or a doc seems to contradict actual behavior. Add one line to the Available Commands escape-hatch note: trust the live CLI (`--help`/`--output json`) and the checked-out source over source-traced prose that can lag the code, and verify on any conflict or confusion. Kept in the always-on brief (universal, needed before any skill loads) rather than duplicated into each skill; per-skill source-map pointers remain the specific layer. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * fix(runtime): scope the source-authority escape hatch to the CLI The previous version told agents the "checked-out source is the deeper authority" for verifying behavior. That over-claims: the repos in a task's brief come from GetWorkspaceRepos + project github_repo resources (per-workspace config, see daemon.registerTaskRepos), not the Multica platform source. A generic agent's checked-out source is its own app, not Multica's code, so it cannot verify a Multica skill/brief claim against it. The only universally available authority for Multica behavior is the live CLI (`--help` / `--output json` / observed command behavior). Re-scope the line accordingly and state plainly that the platform's source is not in the workdir. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * revert(runtime): drop the source-authority escape-hatch line Reverts the brief addition from |
||
|
|
9c9afd4a66 |
feat(metrics): BusinessSamplerCollector for active users / queued / runtime gauges (MUL-2947) (#3706)
* feat(metrics): scrape-time BusinessSamplerCollector for active users / queued / runtime gauges (MUL-2947)
Adds an opt-in prometheus.Collector that runs a fixed set of read-only
SQL queries on every /metrics scrape and exposes the results as gauges:
- multica_active_users{window=5m|1h|24h}
- multica_active_workspaces{window=...}
- multica_agent_task_queued{source}
- multica_agent_task_running{source,runtime_mode}
- multica_agent_task_stuck_total{source}
- multica_runtime_online{runtime_mode,provider}
- multica_runtime_heartbeat_age_seconds{runtime_mode} (histogram)
- multica_workspace_total
Plus a self-introspection histogram
multica_business_sampler_query_seconds{name=...} and a counter
multica_business_sampler_query_errors_total{name=...} so the sampler's
own behaviour is observable on /metrics.
Production-safety contract per the PR4 brief:
- every query runs in its own BEGIN READ ONLY tx with
SET LOCAL statement_timeout = '500ms' (configurable)
- the sampler takes a dedicated *pgxpool.Pool option so operators
can isolate it from business traffic
- successful results are cached for 5–10s (default 8s) to absorb
concurrent scrapes from multiple Prometheus replicas
- every SQL has a hard LIMIT 100 fallback
- all label values flow through the existing BusinessMetrics
NormalizeTaskSource / NormalizeRuntimeMode / NormalizeRuntimeProvider
whitelists, so a misbehaving runtime cannot inflate cardinality
- sampler is OPT-IN via RegistryOptions.BusinessSampler — existing
callers that only pass Pool keep their current behaviour and never
start hitting the DB on /metrics
Tests cover: emit shape, TTL cache (one DB call per N scrapes),
bounded cardinality under malicious labels, opt-out (no leakage), and
DB-hang isolation (unreachable host -> /metrics returns within 5s,
query_errors_total advances).
Refs MUL-2947 (depends on PR2 / MUL-2948, merged in #3695).
Co-authored-by: multica-agent <github@multica.ai>
* fix(metrics): address PR4 review — wire sampler in main.go, fix LIMIT bug, add live-DB statement_timeout test
Three fixes from 大彪's review on #3706:
1. main.go was building NewRegistry without the BusinessSampler option,
so the collector was effectively dead code in prod. Now constructs a
dedicated 2-conn pgxpool (newSamplerDBPool) from the same DATABASE_URL
when METRICS_ADDR is set, plumbs it into RegistryOptions.BusinessSampler,
and defers Close() at shutdown. A pool-build failure logs and disables
the sampler instead of taking down the server.
2. queryActiveUsers / queryActiveWorkspaces previously wrapped the
distinct-user/workspace subquery in a 'LIMIT 100', then COUNT(*)'d
the result — capping the active-user gauge at 100 regardless of
reality. Removed the inner LIMIT; the COUNT scalar is one row anyway,
and metric cardinality is bounded by the fixed samplerWindows
allow-list, not by the SQL shape.
3. The previous DB-hang test only exercised the acquire-fails path. Added
business_sampler_pgsleep_test.go which connects to a live Postgres
(skips cleanly when DATABASE_URL is not set), runs SELECT pg_sleep(2)
inside a sampler-style tx with SET LOCAL statement_timeout = '500ms',
and asserts:
- the call returns in well under 1.5 s (proving the server-side
cancellation, not just our caller-side context)
- query_errors_total{name=pg_sleep_canary} advances
- the duration histogram records the cancellation
Verified locally: 550 ms, SQLSTATE 57014 'canceling statement due to
statement timeout' — exactly the safety net the PR claims.
Refs MUL-2947 / PR #3706.
Co-authored-by: multica-agent <github@multica.ai>
* test(metrics): assert SQLSTATE 57014 on pg_sleep cancellation
The previous assertion only checked that the query was cut off in well
under the sleep duration, which a caller-side context cancellation
would also satisfy. Capturing the inner pgconn.PgError and asserting
Code == "57014" ("query_canceled") nails down that Postgres itself
cancelled the statement because of the SET LOCAL statement_timeout —
so a regression that drops the SET LOCAL line fails this test loudly
instead of silently passing on context cancellation.
Refs MUL-2947 / PR #3706 review nit.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
|
||
|
|
de900b2ba6 |
feat(server): funnel/community/commercial business metrics + PostHog pairing (MUL-2949) (#3698)
* feat(server): funnel/community/commercial business metrics + PostHog pairing (MUL-2949) PR3 of the Grafana board metrics split (parent MUL-2328). Adds 23 new Prometheus counter/histogram families to the PR2 BusinessMetrics collector covering the activation/community/commercial funnels, and binds every PostHog event emission to a matching metric increment so the two sides cannot drift. Funnel: signup, workspace_created, team_invite_sent/accepted, onboarding_*, cloud_waitlist_joined. Content: issue_created, chat_message_sent, agent_created, squad_created, autopilot_created, issue_executed. Runtime: runtime_registered/ready/failed/offline + ready_seconds histogram, daemon_ws_message_received_total. Autopilot: autopilot_run_started/terminal/skipped. Webhook/GitHub: webhook_delivery_total, github_event_received_total, github_pr_review_total, github_pr_merge_seconds histogram. CloudRuntime: cloudruntime_request_total + duration histogram, wired through a small RequestRecorder interface so the cloudruntime package stays decoupled from metrics. Commercial: feedback_submitted, contact_sales_submitted. The pairing helper metrics.RecordEvent(client, m, ev) emits the PostHog event AND increments the matching counter via IncForEvent dispatch, reading labels from the analytics event Properties. Every existing h.Analytics.Capture(analytics.X(...)) call site has been migrated to the helper across handler/, service/, and cmd/server/runtime_sweeper.go. Lint enforcement (server/internal/metrics/business_pairing_test.go): - TestEveryAnalyticsEventHasPrometheusCounter: every Event* constant in analytics/events.go either dispatches via IncForEvent or is in the taskMetricEvents allow-list (PR2 typed RecordTask* methods). - TestNoNakedAnalyticsCaptureInHandlersOrServices: AST-walks handler/ service/cmd-server for direct Analytics.Capture(...) calls — only service/task.go's captureTaskEvent helper is allow-listed. - TestEveryAnalyticsRecordEventTakesAnalyticsHelper: validates the third arg of every metrics.RecordEvent call is built from analytics.*. Cardinality protection: all new label values pass through fixed allow-lists in labels_pr3.go; unknown values collapse to 'other'/'unknown'/'error'. Refs: - Spec MUL-2328 / MUL-2949. - Builds on PR2 (MUL-2948) — collectors registered through the same BusinessMetrics struct, no separate Registry. - Uses PR1's taskfailure.Reason (MUL-2946) for runtime_failed's failure_reason label via NormalizeFailureReason. Out of scope: Sampler-class metrics (PR4 / MUL-2947), pr_review_total emission point (no review event handler exists yet — counter is defined, TODO to wire up when /api/webhooks/github grows pull_request_review handling). Co-authored-by: multica-agent <github@multica.ai> * fix(server): tighten PR3 review items — signup_source bucket, fill platform/kind/form_source enums, onboarding_started server emission, lint scope (MUL-2949) Addresses 张大彪's review on #3698: 1. signup_source: NormalizeSignupSource added to labels_pr3.go with a fixed allow-list bucket (direct/google/twitter/linkedin/.../other). Parses JSON cookie payload for utm_source/source/referrer fields, strips URL schemes, maps well-known hostnames to channel buckets. PostHog event still ships the raw cookie value for analytics; only the Prometheus label is bucketed. 2. Filled the unknown/other label gaps: - analytics.IssueCreated and analytics.ChatMessageSent now take a platform parameter sourced from middleware.ClientMetadataFromContext (X-Client-Platform header) at the handler. Autopilot-originated issues stamp PlatformServer. - analytics.FeedbackSubmitted now takes a kind parameter; CreateFeedback reads req.Kind (default "general") so the picker selection lights up the metric's kind label instead of long-term "other". - analytics.ContactSalesSubmitted now takes a formSource (page / onboarding / agents_page); CreateContactSales reads req.Source. The metric reads ev.Properties["form_source"] so the analytics CoreProperties.Source ("marketing_contact_sales") stays backward-compat for PostHog dashboards. 3. analytics.OnboardingStarted helper added; server-side emission lives in PatchOnboarding, fired exactly once per user on the first PATCH that carries a non-empty questionnaire payload (firstTouch logic compares prior bytes against {} / null). Frontend onboarding_started keeps firing on page open; the server emission is what guarantees the Prometheus counter exists so Grafana can be cross-checked against the PostHog funnel without depending on the SDK roundtrip. 4. business_pairing_test.go tightened: - TestNoNakedAnalyticsCaptureInHandlersOrServices now allow-lists at function granularity (just captureTaskEvent in service/task.go), not whole-file. Any future naked Capture in the same file fails CI. - TestEveryAnalyticsRecordEventTakesAnalyticsHelper now does def-use tracking inside the enclosing FuncDecl: when RecordEvent's third arg is an *ast.Ident, the test walks the function body for the assignment that defined it and confirms the RHS is an analytics.<Helper>(...) call. Bare local idents that didn't originate from analytics are now caught. 5. gofmt -w applied across the touched files; gofmt -l clean. Tests: go test ./internal/metrics/... ./internal/analytics/... pass. Pre-existing TestClaimTask_/TestWebhook_MergedPR/TestDeleteIssueByIdentifier failures on origin/main are DB-environment-dependent and not regressions from this change. Co-authored-by: multica-agent <github@multica.ai> * fix(server): normalise onboarding_started platform label + regression test (MUL-2949) Addresses 张大彪's last review nit: - IncForEvent's EventOnboardingStarted case now wraps the platform property with NormalizePlatform, matching every other platform-bearing metric. A misbehaving frontend can no longer leak a raw X-Client-Platform header value into the multica_onboarding_started_total{platform=...} series. - New labels_pr3_test.go covers every PR3 normalizer with both a happy-path value and an unknown value, asserting the unknown collapses to the documented fallback bucket. Includes a focused regression for onboarding_started: emits one event with an attacker-shaped platform string and asserts the metric only exposes web + unknown label values (no raw header bleed). - testutil.go gains a small GatherForTest helper so the regression test can pull the typed MetricFamily map without re-implementing the registry-walk dance. Co-authored-by: multica-agent <github@multica.ai> * fix(server): NormalizeTaskSource on workspace_created + document lint limitations (MUL-2949) Final review touch-ups before merge: - IncForEvent's EventWorkspaceCreated case wraps source through NormalizeTaskSource, matching the other source-bearing dispatches (issue_created, agent_created, issue_executed). Closes the last raw property leak in the dispatcher table. - business_pairing_test.go inline docstrings now spell out the two known limitations of the lint gate that 张大彪 / Eve flagged: analyticsBackedIdents matches by ident NAME (not SSA def-use, so a nested-scope shadow could pass) and isMetricsRecordEvent hard-codes the import alias set. PR description carries a Follow-ups section with the same two items so the work is visible after merge. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: 魏和尚 <agent+wei@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
5900d8b637 |
fix(issues): make start_date/due_date timezone-stable calendar days (#3618) (#3692)
* fix(issues): store start_date/due_date as DATE, not timestamp (MUL-2925) These fields are calendar days (the pickers offer no time-of-day), but were stored as TIMESTAMPTZ. A client serializing local midnight via toISOString() folded its timezone into the instant, so the day shifted by the local offset (GH #3618). Migrate the columns to DATE and parse/serialize date-only "YYYY-MM-DD". ParseCalendarDate still accepts legacy RFC3339 (truncated to the UTC day) so older clients keep working. Co-authored-by: multica-agent <github@multica.ai> * fix(issues): render start_date/due_date as timezone-stable calendar days (MUL-2925) Pickers now emit date-only "YYYY-MM-DD" (local calendar day) instead of toISOString(), and every read formats via the shared @multica/core/issues/date helpers with timeZone:"UTC" so the day never shifts with the viewer's offset. The Gantt's existing UTC bucketing is now correct. Covers web/desktop pickers, quick-set menu, list/board/detail/activity, and the mobile due-date picker. Co-authored-by: multica-agent <github@multica.ai> * fix(issues): address date-only review — loud-fail ambiguous dates, finish display sweep (MUL-2925) Review follow-ups on #3692: - ParseCalendarDate no longer silently truncates a legacy non-midnight RFC3339 to the wrong UTC day; it accepts only YYYY-MM-DD or an exact UTC-midnight instant and rejects ambiguous ones loudly. Adds util unit tests. - migration 112 pins the TIMESTAMPTZ->DATE conversion to UTC explicitly via AT TIME ZONE 'UTC' (was session-timezone dependent); down migration too. - Convert remaining date-change display sites to formatDateOnly: inbox detail label (web) and mobile activity + inbox labels (were new Date()+local format). - CLI --start-date/--due-date help now says YYYY-MM-DD, not RFC3339. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
a72fb020de |
Add business metrics collectors (#3695)
Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
10afd1af1b |
feat(server): introduce pkg/taskfailure classifier and switch in-flight failure_reason writes (MUL-2946) (#3693)
Lift MUL-1949's offline backfill failure_reason taxonomy into a shared in-flight classifier so the agent_task_queue.failure_reason column is written with refined values (provider_auth_or_access, context_overflow, provider_capacity_or_rate_limit, …) at write time rather than waiting on SQL backfill to re-classify after the fact. PR1 of the Grafana board plan in MUL-2328 — the upcoming PR2 reuses pkg/taskfailure.AllReasons() to pre-warm the Prometheus failure_reason label set. * server/pkg/taskfailure: new package with the canonical 21 Reason constants (7 platform-side + 14 agent_error.* sub-reasons), AllReasons() returning a defensive copy, IsAgentError() prefix check, and Classify(rawError) Reason mirroring the SQL CASE rules from MUL-1949 (db-boy's analysis). 100% statement coverage. * server/internal/daemon/daemon.go: route the 'agent_error' coarse fallback paths (StartTask error, runTask early-return error, CompleteTask permanent rejection, reportTaskResult default branch) and the executeAndDrain default error case (chained after classifyPoisonedError) through taskfailure.Classify so blocked / timeout / unknown-status results all carry a refined reason on the wire. * server/internal/service/task.go: FailTask classifies errMsg when the daemon-supplied failureReason is empty, eliminating the legacy COALESCE(.., 'agent_error') landing. * server/internal/daemon/poisoned.go: alias FailureReasonIterationLimit and FailureReasonAPIInvalidRequest to the canonical taskfailure constants. agent_fallback_message and codex_semantic_inactivity are pre-existing operational reasons not in the canonical 21 — kept as literals for now and revisited in a follow-up PR. Backfill SQL from MUL-1949 stays as the authoritative offline source of truth; this PR keeps the in-flight classifier in lock-step with the SQL CASE expression so historical and future rows share the same taxonomy. No behavior change for the platform-side reasons (queued_expired, runtime_offline, runtime_recovery, timeout, etc.) which already align with the canonical set. Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
f2f17e3355 |
Optimize chat message loading (#3685)
* Optimize chat message loading Co-authored-by: multica-agent <github@multica.ai> * Fix chat history cursor pagination Co-authored-by: multica-agent <github@multica.ai> * Fix chat session list remount key Co-authored-by: multica-agent <github@multica.ai> * fix(chat): fall back to legacy /messages when paged endpoint 404s Deployment-order compatibility: a backend deployed before the /messages/page endpoint existed returns 404 for the unknown route. The cursorless initial page now falls back to the legacy full-list /messages endpoint and wraps it in a single has_more:false page, so chat never white-screens regardless of which side deploys first. A 404 on a cursor request still propagates to avoid duplicating the full list. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
44feb3d06d |
fix(skill): canonicalize reserved SKILL.md path check across daemon + API (#3660)
A skill_file row whose path is the skill's own SKILL.md (persisted by older builds or direct create/update API calls) collides with the primary content the daemon writes itself, failing task prep with errPathPreExists on every non-codex local runtime (#3489). #3526 guarded this with strings.EqualFold(path, "SKILL.md") at the daemon write site and the three API ingress points, but the stored path is not canonicalized: "./SKILL.md" or "sub/../SKILL.md" slip past the exact-match guard while filepath.Join still resolves them onto the same SKILL.md, so prep can still break. Extract one canonical helper, skill.IsReservedContentPath, that cleans the path before the case-insensitive compare, and use it at all four sites (execenv writeSkillFiles, skill create, update, single-file upsert). Add a daemon-side regression test for writeSkillFiles ignoring a bundled SKILL.md (exact + "./" spellings) — the load-bearing fix previously had only API-layer coverage — plus a unit test for the helper. Existing poisoned rows are intentionally left in place (skipped at prep) per the decision on MUL-2928. MUL-2928 Follow-up to #3526; supersedes #3560. Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
996eb07dc5 |
fix(daemon): skip duplicate SKILL.md in supporting files to prevent task prep failures (#3526)
Fixes #3489 MUL-2928 |
||
|
|
888186b183 |
fix(daemon): make comment-posting guardrail provider-agnostic (MUL-2904) (#3654)
* fix(daemon): make comment-posting guardrail provider-agnostic (MUL-2904) Agents inlining a backtick-wrapped token into `multica issue comment add --content "..."` had the shell run it as a command substitution, silently deleting the token; the stored comment never matched the model's intent, so it retried forever — spamming OKK-497 with duplicate comments. The corruption is shell-driven, not provider-driven, so extend the "never inline --content; use --content-file / quoted-HEREDOC --content-stdin" rule from Codex-only to ALL providers: - BuildCommentReplyInstructions: collapse the Linux/macOS non-Codex inline branch into the unified quoted-HEREDOC stdin template. - buildMetaSkillContent: rename "Codex-Specific Comment Formatting" -> "Comment Formatting" and emit it for every provider; strengthen the Available Commands entry and the assignment step-6 examples to steer away from inline --content. - Windows behavior unchanged (file-only; avoids PowerShell ASCII drop). Tests: flip the non-Codex Linux reply test into a MUL-2904 regression, broaden the stdin-emphasis test across providers, and pin the provider-agnostic guardrail. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): keep Windows assignment brief file-only (address review) Review catch on #3654: the previous commit added platform-agnostic prose recommending "--content-file or --content-stdin" in the Available Commands entry and the assignment-triggered step-6 example. The assignment path has no BuildCommentReplyInstructions OS override, so on Windows an agent following step 6 literally would pipe its final comment through PowerShell and drop non-ASCII bytes (#2198 / #2236 / #2376) — contradicting this PR's own Windows file-only rule in the ## Comment Formatting section. Make the platform-agnostic surfaces defer to the OS-aware ## Comment Formatting section (the single source of truth) instead of naming stdin. The flag synopsis still lists all three modes. Add TestInjectRuntimeConfigWindowsAssignmentBriefStaysFileOnly: a Windows assignment-triggered brief must not contain any prescriptive "... or --content-stdin" recommendation. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
580ad5b492 |
fix(issues): validate and clamp limit/offset in ListIssues (MUL-2847) (#3585)
* fix(issues): validate and clamp limit/offset in ListIssues (MUL-2847)
ListIssues parsed the limit and offset query params but never validated
them, so:
- GET /api/issues?limit=-1 -> HTTP 500 (Postgres rejects negative
LIMIT with SQLSTATE 2201W)
- GET /api/issues?limit=100000000 -> unbounded read in a single
response
- GET /api/issues?offset=-1 -> same 500
SearchIssues and ListGroupedIssues already apply v > 0 + an upper clamp
on limit and v >= 0 on offset. This brings ListIssues to the same
pattern: ignore non-positive limit (keep default 100), clamp to 100,
ignore negative offset (keep default 0). default == clamp == 100 keeps
existing callers' behavior identical and matches the upstream issue
suggestion.
TestListIssues_LimitValidation seeds 3 issues in a dedicated project
and pins the nine boundary cases (negative/zero/huge/non-numeric
limit, negative/non-numeric offset, the clamp boundary, and explicit
small/positive-offset sanity) plus two sanity checks that an explicit
small limit and a positive offset are honored.
Fixes MUL-2847 / upstream multica-ai/multica#3563.
Co-authored-by: multica-agent <github@multica.ai>
* test(issues): strengthen LimitClamp test and fix comments (MUL-2847)
Address review feedback from @Lambda and @Emacs on PR #3585:
1. The 3-row set in TestListIssues_LimitValidation can't distinguish
'clamp fired' from 'clamp missing': with only 3 rows, limit=100000000
returns 3 rows whether or not the clamp exists. Split the clamp
behavior into a new TestListIssues_LimitClamp that seeds 101 issues
and asserts len(issues) == 100 for limit=100/101/200/100000000, plus
limit=50 honored below the clamp. Without the clamp line, the
huge/above-clamp subtests would fail with len == 101.
2. Fix the misleading comment that claimed 'limit=0 -> same 500'.
Postgres LIMIT 0 is valid SQL and returns zero rows. The guard
exists for sibling-consistency (SearchIssues / ListGroupedIssues
already treat v <= 0 as 'use default'), not to avoid a 500. Move
the limit=0 case out of TestListIssues_LimitValidation since it's
not 500-related; TestListIssues_LimitClamp's 'no limit returns
default page of 100' subtest pins the default behavior anyway.
3. Add a subtest that pins the offset+clamp composition
(limit=200&offset=50 against 101 rows = 51 rows), proving the
clamp caps the page size while offset still indexes the full
result set.
4. Fix gofmt: the original file's leading-bullet comment indentation
was off by two spaces; gofmt -l now reports clean.
All 14 subtests across both functions pass; full ./internal/handler/
suite still passes (3.2s).
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
|
||
|
|
f539fdba83 |
feat(onboarding): backfill prompt for missing source attribution (MUL-2796) (#3550)
* feat(onboarding): backfill prompt for users missing source attribution Adds a one-shot popup shown after login to already-onboarded users whose `onboarding_questionnaire.source` was never recorded — either they completed onboarding before the source step shipped, or they clicked Skip on it. Reuses the existing 12-option StepSource UI and the existing `PATCH /api/me/onboarding` endpoint, so no schema or backend changes. Web renders it as a route at /onboarding/source (sibling of the reserved /onboarding); desktop dispatches it as a WindowOverlay per the Route categories rule. Submit and explicit Skip are terminal; the close X bumps a per-user localStorage counter and stops appearing after 3 dismissals. Emits source_backfill_shown / submitted / skipped / dismissed PostHog events so the funnel can be tracked separately from first-time onboarding. For MUL-2796. Co-authored-by: multica-agent <github@multica.ai> * fix(onboarding): preserve role/use_case and respect dismiss cap in source backfill Round-2 fixes from Emacs's review of #3550: 1. PATCH wipe: `PATCH /api/me/onboarding` replaces the JSONB column wholesale (server/internal/handler/onboarding.go), so sending only the source slots was wiping role/use_case/version for exactly the historical users this targets. Read user.onboarding_questionnaire, overlay the source fields client-side via mergedQuestionnairePatch, and send the full shape. 7 unit cases cover the merge semantics. 2. Legacy single-string source: pre-multi-select rows wrote `source: "search"` as a bare string. needsSourceBackfill now treats that as already answered, matching mergeQuestionnaire (views) and stringOrSlice.UnmarshalJSON (server). Flipped the existing test and added empty-string + null coverage. 3. Dismiss cap honored in callback: the web auth callback was passing dismissCount=0, which would force-route capped users through /onboarding/source on every login (the route page would bounce them onward, but only after a blank detour and a re-fired `source_backfill_shown` event). Added readSourceBackfillDismissCount so the callback reads the same per-user localStorage bucket the prompt writes to. Test asserts a count of 3 bypasses the detour. Co-authored-by: multica-agent <github@multica.ai> * test(onboarding): clear source-backfill dismiss counter in callback test beforeEach Co-authored-by: multica-agent <github@multica.ai> * fix(onboarding): footer hint text matches the Submit button on the backfill prompt The Source step's hint reads "Hit Continue when you're ready" because its commit button is "Continue". The backfill view ships a "Submit" button instead, so the inherited hint was misleading. Add a dedicated `source_backfill.hint_ready` key across en / zh / ko and use it here. Caught during browser E2E in the round-2 verification stack. Co-authored-by: multica-agent <github@multica.ai> * fix(onboarding): magic-code login also detours through source backfill The round-2 fix in PR #3550 only wired the source-backfill detour into the OAuth `/auth/callback` post-success path. Magic-code login goes through `/login` → `handleSuccess()` which calls `resolveLoggedInDestination()` and pushes directly to the workspace, so those users never reach `/onboarding/source`. Caught during the local-env demo for Jiayuan. Add `maybeSourceBackfillDetour` to the login page and apply it in both the already-authenticated useEffect and the post-verify-code handler. Predicate consults the same per-user localStorage bucket the prompt writes to, so a user who hit the close-X cap on this browser flows straight through. Co-authored-by: multica-agent <github@multica.ai> * refactor(onboarding): source backfill is a workspace-mounted modal, not a route detour Per UAT, the prompt should overlay the workspace as a Dialog with the workspace visible behind a dimmed backdrop — the original brief and reference screenshot both showed a modal. PR #3550 shipped a full-window takeover (web /onboarding/source + desktop WindowOverlay) which Jiayuan rejected. This commit replaces the full-window view with a Dialog-based `<SourceBackfillModal />` mounted once inside the shared `DashboardLayout` (packages/views/layout). The modal self-mounts: it reads `needsSourceBackfill(user, dismissCount)` and opens itself when the predicate flips to true; X / ESC / outside-click all bump the per-user localStorage cap and close. Removed: - apps/web/app/(auth)/onboarding/source/page.tsx (route) - paths.sourceBackfill (no longer needed) - callback page detour - login page maybeSourceBackfillDetour - desktop WindowOverlay type "source-backfill" - desktop navigation interception of /onboarding/source - desktop App.tsx dispatch effect - pageview-tracker case - views/onboarding `SourceBackfillView` + `readSourceBackfillDismissCount` exports Preserved (semantics unchanged): - `needsSourceBackfill` predicate (incl. legacy single-string source coercion) - `mergedQuestionnairePatch` so role / use_case survive Submit / Skip - PostHog events: source_backfill_shown / submitted / skipped / dismissed - Per-user dismiss-count cap (3) in localStorage - en / zh / ko i18n strings Tests: - 7 new tests for the modal in packages/views/onboarding/source-backfill-modal.test.tsx - Adjusted apps/web/app/auth/callback/page.test.tsx: detour tests dropped, one assertion remains that onboarded users with missing source land in the workspace (the modal handles the rest) - Full suite: 965 tests pass, typecheck + lint clean Co-authored-by: multica-agent <github@multica.ai> * fix(onboarding): mount source-backfill modal on the desktop workspace too Desktop's WorkspaceRouteLayout never wraps DashboardLayout, so the previous commit's modal mount only fired for web. Regression: desktop users were not seeing the prompt at all. Wire the same `<SourceBackfillModal />` next to `<WelcomeAfterOnboarding />` inside `workspace-route-layout.tsx`, with the matching `!overlayActive` suppression so the Dialog doesn't portal-jump above an active pre-workspace WindowOverlay (onboarding / accept-invite / new-workspace). Same component on both platforms — single source of truth lives in packages/views/onboarding/source-backfill-modal.tsx. Also drop the now-stale `source-backfill detour` comment in the web callback test fixture (Emacs nit, non-blocking). Co-authored-by: multica-agent <github@multica.ai> * test(desktop): assert workspace-route-layout mounts source-backfill modal Two structural tests pinning the round-4 fix: - `mounts SourceBackfillModal when no WindowOverlay is active` — guards against the regression Emacs caught (modal silently absent on desktop because the previous round only wired DashboardLayout). - `suppresses SourceBackfillModal while a WindowOverlay is active` — mirrors the existing `!overlayActive` rule that WelcomeAfterOnboarding already relies on so a portal-rendered Dialog can't visually outrank an active pre-workspace overlay. Mocks the SourceBackfillModal with a marker component so the test asserts mount/unmount without depending on the modal's own predicate gate. Co-authored-by: multica-agent <github@multica.ai> * fix(onboarding): backfill modal Other toggles off; entrance settles after 700ms UAT round-3 follow-ups from Jiayuan: 1. **Other can't be deselected**: the modal kept a parallel `pendingOther` flag set to true on every Other click, and `IconOtherOptionCard`'s row click was guarded with `if (!selected) onSelect()` — so a second click neither flipped pendingOther nor reached the parent toggle. Drop `pendingOther` (the `source.includes("other")` derivation is already authoritative) AND add an opt-in `allowToggleOff` prop to `IconOtherOptionCard` that lets the row toggle when already selected. The text input stops click propagation so typing never deselects. 2. **Rebase + absorb GitHub channel**: rebased onto origin/main which added `social_github` (PR #3612). Modal's option list now mirrors StepSource — GitHub slotted between YouTube and Other social, reusing the existing `GitHubIcon`. 3. **Soft entrance**: defer the dialog open by 700ms after the user lands on a workspace so the underlying view paints first and the modal feels like an inviting prompt rather than a hard block. Honour `prefers-reduced-motion: reduce` (open immediately for users who have opted out of incidental motion). Tests: - New `Other toggles off on the second click instead of getting stuck` - New `renders the GitHub channel rebased from origin/main` - New `defers the entrance by ~700ms when the user has not opted into reduced motion` - Existing tests stamp `prefers-reduced-motion: reduce` in beforeEach so the dialog opens synchronously and they don't need to drive fake timers. Full suite passes (969 tests). Co-authored-by: multica-agent <github@multica.ai> * fix(onboarding): backfill modal opens reliably + Other deselects via icon area Three follow-up fixes after live UAT: 1. Strict-mode regression on entrance delay: the gate ref was being stamped when the effect *scheduled* the timer, so React Strict Mode's double-invoke cleared the first timer and then bailed on the second pass because the ref was already set, leaving the dialog forever closed. Stamp the ref only inside the timer callback (or synchronously when reduced-motion is on) so the second strict pass starts a fresh timer. 2. Other deselect: dropping `pendingOther` wasn't enough — the input that replaces the label when Other is selected was previously stopping click propagation, so a re-click on the row never reached the toggle. Remove `e.stopPropagation()` and instead let the row's onClick ignore clicks whose target IS the input (typing / focusing the input still doesn't deselect; clicks on the icon, padding, or border do). 3. Tests: drive the Other re-click via Playwright `click({position: {x:24,y:24}})` so the click lands on the icon area instead of the center of the input, matching real-user behaviour. Co-authored-by: multica-agent <github@multica.ai> * refactor(onboarding): source picker is single-select primary source Per Jiayuan's call after the survey of HDYHAU UX in PLG SaaS (Linear / Vercel / Loom / Notion / Webflow / Stripe / Figma / Cursor / PostHog mostly skip the question entirely; where it's asked the documented default — Fairing / Recast / HockeyStack / Ruler Analytics — is to capture the primary source so channel weights sum to 100% and ROI math is defensible). Modal + StepSource both pivot from multi-select to single-select radio. Server schema is intentionally untouched: `source` stays `string[]` for back-compat with v2 multi-select rows; the client always sends a one-element array. Zero migration, zero data loss. Frontend: - `source-backfill-modal.tsx`: state pivots from a multi-element `source: Source[]` to a single `pickedSlug` derived from `source[0]`; click handler replaces the array instead of toggling. Cards switch to `mode="radio"`, the fieldset gets `role="radiogroup"`, the now-redundant `pendingOther` and `allowToggleOff` opt-in go away — radio mode means no toggle-off, so the original UAT bug ("Other can't be deselected") is structurally impossible. - `step-source.tsx`: drop the `multiSelect` prop so it routes through `step-question.tsx`'s existing radio path (same one StepRole already uses). Picking a second option replaces the first; switching away from Other clears `source_other` so a stale value can't leak. - `icon-option-card.tsx`: revert the `allowToggleOff` plumbing. Tests: - `source-backfill-modal.test.tsx`: drop the multi-select toggle-off assertion; add "picking a second option replaces the first" with explicit radio-role queries. - `step-source.test.tsx`: rewrite multi-select tests as single-select (no more "stacks several picks" / "toggle off" cases); add "switching away from Other clears source_other". Full suite (970 tests) green, typecheck + lint clean. Co-authored-by: multica-agent <github@multica.ai> * docs(onboarding): refresh stale multi-select comments around source Comment-only follow-up to the single-select refactor in d14f9d09f. Five docblocks still described `source` as multi-select; they now correctly say single-select and explain the array shape is kept purely for v2 back-compat with the JSONB column. - packages/core/onboarding/types.ts — QuestionnaireAnswers docblock - packages/core/onboarding/store.ts — PostHog mirror comment - packages/views/onboarding/steps/step-question.tsx — header docblock, canContinue branch, and footer-hint comment (Source moves from the multi-select side to the single-select side; Use case stays as the remaining multi-select consumer) - server/internal/handler/onboarding.go — questionnaireAnswers docblock and the stringOrSlice fall-back comment (the column "going multi- select" is no longer the current state; rename to "pre-array shape") - server/internal/analytics/events.go — OnboardingQuestionnaireSubmitted docblock No behaviour changes. Tests + Go build still green. Co-authored-by: multica-agent <github@multica.ai> * i18n(onboarding): add ja translations for source-backfill keys The Japanese locale landed on main (PR #3538) after this branch started, so my source-backfill round-2 keys (`common.close`, `source_backfill.eyebrow / lede / submit / hint_ready`) never made it into ja and the parity test fails in CI. Add them now with translations that match the en/zh-Hans/ko wording and tone. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Lambda <lambda@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
91c1e51411 |
feat(editor): add / slash-command palette for invoking agent skills (#3159)
* feat(editor): add / slash-command palette for invoking agent skills Adds a `/` trigger in the chat box that opens a popover listing the active agent's skills. Selecting an item inserts a `[/label](slash://skill/<id>)` token; the daemon extracts those IDs in `buildChatPrompt` and emits an "Explicitly selected skills:" block using the canonical names from the agent's skill registry — labels are display-only and never trusted. Built on Tiptap's `Mention` extension so the suggestion lifecycle, keyboard routing, and IME handling mirror the existing `@` mention UX. Item list is sourced from the React Query workspace cache (no per-keystroke fetch). Gated behind a new `enableSlashCommands` prop so only `chat-input` opts in; other `ContentEditor` consumers (issue editor, comments) are unaffected. Read-only markdown surfaces render the token as a `.slash-command` pill via a custom link renderer + sanitize-schema/url-transform allowlists. Closes #3108 * fix(i18n): add slash_command editor copy for ko/ja The PR added slash_command popover empty-state keys to en + zh-Hans only; locales/parity.test.ts requires every locale to cover every EN key, so ko and ja failed CI. Add the two keys (no_skills_configured, no_results) matching existing skill terminology (스킬 / スキル). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Naiyuan Qing <145280634+NevilleQingNY@users.noreply.github.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
2348301d2b |
fix: gate private squad leader bypass (MUL-2860) (#3648)
* fix: gate private squad leader from being triggered by unauthorized members Add canEnqueueSquadLeader helper that checks canAccessPrivateAgent before allowing a squad leader to be enqueued. Gate all EnqueueTaskForSquadLeader call sites: 1. enqueueSquadLeaderTask (comment trigger, assign trigger, backlog→todo) 2. triggerChildDoneSquad (child-done → parent squad leader) 3. autopilot.go (defensive comment; actor is always agent → always passes) Also fix validateAssigneePair's squad branch to run canAccessPrivateAgent on the squad leader, returning 403 'cannot assign to squad with private leader' when the actor lacks access. Thread actorType/actorID through notifyParentOfChildDone → dispatchParentAssigneeTrigger → triggerChildDoneSquad so the child-done path can enforce the private-leader gate. Regression tests: - Plain member blocked from create-issue to private-leader squad (403) - Plain member blocked from update-issue to private-leader squad (403) - Owner allowed to assign private-leader squad - Plain member comment on squad-assigned issue doesn't trigger private leader - Child-done by plain member doesn't trigger parent's private leader - Agent actor can still trigger private leader via comment Closes MUL-2860 Co-authored-by: multica-agent <github@multica.ai> * fix: add private-leader gate to autopilot save + dispatch paths - validateAutopilotAssignee squad branch: call canAccessPrivateAgent on the leader, returning 403 for unauthorized members at save time. - service/autopilot.go: add canCreatorAccessPrivateLeader helper that mirrors the handler-level canAccessPrivateAgent logic (agent creators pass; member creators must be owner/admin or agent owner). - Gate both dispatch paths (dispatchCreateIssue and dispatchRunOnly) with fail-closed check: if leader is private and creator lacks access, the run is skipped instead of triggering the private leader. Regression tests: - Plain member create autopilot to private-leader squad → 403 - Plain member update autopilot to private-leader squad → 403 - Owner create autopilot to private-leader squad → 201 - Owner-created autopilot dispatch → issue_created (positive) - Legacy plain-member-created autopilot dispatch → skipped (fail-closed) Co-authored-by: multica-agent <github@multica.ai> * test: add run_only legacy private-leader squad dispatch regression test Covers the dispatchRunOnly path explicitly, complementing the existing create_issue dispatch test. Both dispatch branches now have direct test coverage for the private-leader fail-closed gate. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
fd1cdf1801 |
fix project progress cache invalidation (#3016)
Co-authored-by: chener <chener@M5Air.local> |
||
|
|
e36f874c86 |
feat: add additive agent skill assignment (#3642)
* feat: add additive agent skill assignment Co-authored-by: multica-agent <github@multica.ai> * test: cover cross-workspace agent skill add Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
1aa742053b |
i18n: add japanese locale (MUL-2893) (#3538)
* i18n: add japanese locale * fix: spacing issues * refactor * fix(desktop): set <html lang> before paint to avoid JA Kanji font flash Switch the documentElement.lang sync from useEffect to useLayoutEffect so lang is committed before the first paint. Otherwise Japanese desktop users saw one frame of Kanji rendered with the Chinese-first fallback stack before the html[lang|="ja"] CJK override applied. Also fix the stale selector in the HTML_LANG comment (html[lang^="ja"] -> html[lang|="ja"]). Addresses review nits on MUL-2893. Co-authored-by: multica-agent <github@multica.ai> * fix(docs): tokenize the ideographic iteration mark in JA search Add U+3005 (々) to the Japanese search tokenizer character class. It sits just below the kana blocks, so words like 様々 / 日々 / 個々 previously dropped the mark and split awkwardly, hurting recall. Addresses a review nit on MUL-2893. Co-authored-by: multica-agent <github@multica.ai> * fix(i18n): restore ja locale parity after merging main Merging main brought new EN strings into agents/chat/onboarding/settings/ squads that the ja bundle (authored against an older snapshot) lacked, breaking the locales parity test. Add the Japanese translations for the new keys (workspace logo upload, agents runtime filter, chat session-history stop dialog, onboarding social_github, squad archived status) and drop the two renamed chat window keys (active_group / archived_group) that EN removed in favour of history_group. Fixes the failing @multica/views parity.test.ts on the FE CI for MUL-2893. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
03961206ff |
docs(squad): correct stale "four status buckets" comments to five (#3640)
Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
a6b83fef41 |
fix(agents): surface archived status for retired agents (#3608)
Retired agents (agent.archived_at set) previously read as offline across the agent dot, hover card, detail badge, and squad member list — a leftover online runtime row could even make them look reachable. Add a dedicated archived presence/status that wins over every runtime/task signal so a retired agent never reads as live or merely offline. - Add archived to AgentAvailability and SquadMemberStatusValue unions - Short-circuit deriveAgentPresenceDetail before runtime/task scan - Backend deriveSquadMemberStatus returns archived instead of offline - Render gray Archive dot/label; skip workload + reassign affordances - en/ko/zh-Hans locale strings |
||
|
|
700cd97407 |
feat(workspace): add per-workspace logo upload (#2760)
Adds avatar_url column to workspace, threads it through the API +
WorkspaceAvatar component, and adds a click-to-upload editor in the
workspace settings tab. Mirrors the squad avatar pattern (migration 086);
UI strings use "logo" while the schema/code uses avatar_url for codebase
consistency with user.avatar_url and squad.avatar_url.
- migration 093: ALTER TABLE workspace ADD COLUMN avatar_url TEXT
- UpdateWorkspace SQL + handler accept avatar_url (auth gated to
owner/admin at the router via RequireWorkspaceRoleFromURL)
- WorkspaceAvatar renders <img> when avatar_url is set, falls back to
the initial-letter span otherwise
- workspace-tab.tsx adds a 16x16 click-to-upload logo editor at the
top of the general settings card, using useFileUpload + accept=
image/png,image/jpeg,image/webp (server stores under workspaces/{id}/)
- en + zh-Hans settings i18n strings added
Co-authored-by: Matt Voska <voska@users.noreply.github.com>
|
||
|
|
674be86add |
fix(tasks): cancel autopilot run_only & quick_create tasks (MUL-2827) (#3615)
CancelTaskByUser (POST /api/tasks/{taskId}/cancel) keyed cancellation off
issue_id / chat_session_id alone, so any task whose only source link was
autopilot_run_id (run_only autopilots) or quick_create context fell into the
dead else branch and 404'd with "task not found" — even though the task was
visible (and showed a cancel X) on the agent Activity tab.
Enforce tenancy uniformly through the task's owning agent instead: agent_id is
NOT NULL on every task row (ON DELETE CASCADE), and agents are workspace-scoped,
so GetAgentTaskInWorkspace (task JOIN agent ON workspace) is a single tenant
guard that works regardless of which optional source FK is set — including
orphan tasks whose autopilot_run_id was SET NULL after the autopilot was
deleted. Privacy layers on top: chat tasks stay creator-only, and every other
task mirrors the agent Activity / snapshot private-agent visibility gate via
canAccessPrivateAgent so the id-only endpoint is never more permissive than the
surface that exposes the task.
Tests cover run_only (same-ws success, cross-ws 404 no-mutation), quick_create,
retry clones, issue-task regression, chat non-creator 403, and private-agent
plain-member 403.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
|
||
|
|
0f9d9d1494 |
fix(skills): align Go/TS frontmatter coercion for non-scalar values (#3614)
The Go SKILL.md frontmatter parser unmarshalled into a {Name,Description}
string struct, so a non-scalar value (a list/map written where a scalar
belongs) made the whole decode fail and dropped even a valid sibling
`name`. The TS parser instead kept the name and JSON-encoded the value,
so the file-viewer (TS) and the import path (Go) could disagree about
the same SKILL.md.
Decode into a generic map and coerce per key on the Go side, mirroring
the TS coercion (scalars -> literal form, sequences/mappings -> JSON), so
both sides produce identical results and a structured value never
discards a sibling key. Rename ParseFrontmatter -> ParseSkillFrontmatter
to remove the cross-language name clash with the TS parseFrontmatter
(which returns {frontmatter, body}), and drop the unused TS
parseSkillFrontmatter export.
Add parity tests for sequence/mapping values plus name-only,
description-only, leading-blank-line and triple-dash-in-body edge cases
on both sides.
Follow-up to #3543 / MUL-2842.
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
|
||
|
|
801c201d4c |
fix(skills): parse multi-line YAML frontmatter in SKILL.md (#3495) (#3543)
Three independent line-based frontmatter parsers only handled single-line `description: value`, so a YAML block scalar (`description: |`) collapsed to the literal "|" and the rest of the description was dropped before it ever reached the database. Replace all three with real YAML decoders that understand block scalars, folded scalars and quoted values: - server/internal/skill: shared ParseFrontmatter via gopkg.in/yaml.v3, used by both the handler import path and daemon local-skill discovery - packages/core/skills: shared parseFrontmatter via the yaml package - file-viewer renders multi-line frontmatter values (whitespace-pre-wrap) Both parsers fall back to empty values on malformed YAML, preserving the previous non-fatal behaviour. |
||
|
|
dd4d58f20e |
feat: add skill search CLI (#3601)
Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
2b2888c23a |
Handle duplicate skill imports as structured results (#3599)
Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
3c8645e546 |
feat(cli): add squad member set-role (#3583)
Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
4ae4722ef0 |
fix(comments): preserve direct parent on replies (#3579)
Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
2cf8107fc8 |
feat(email): support implicit TLS (SMTPS/465) for SMTP relay (MUL-2768) (#3340)
* feat(email): support implicit TLS (SMTPS/465) for SMTP relay The SMTP relay previously only did opportunistic STARTTLS: it dialed plaintext and upgraded if the server advertised STARTTLS. Providers that only offer implicit TLS on port 465 and do not advertise STARTTLS (e.g. Aliyun enterprise mail) could not be used as a relay at all. Add an SMTP_TLS env var: - unset / starttls (default): unchanged STARTTLS-upgrade behavior. - implicit / smtps / ssl: dial with tls.DialWithDialer (SMTPS). Implicit TLS is auto-enabled when SMTP_PORT=465 and SMTP_TLS is unset, so the common case works with no extra config. The startup log line now reports the negotiated mode (starttls / implicit-tls). Co-authored-by: multica-agent <github@multica.ai> * feat(email): plumb SMTP_TLS through selfhost compose, warn on unknown values The backend reads SMTP_TLS but docker-compose.selfhost.yml never forwarded it, so SMTP_TLS=implicit on a non-standard port (or an explicit starttls override on 465) silently did nothing inside the container. Add it to the backend.environment block. Also log a one-line warning when SMTP_TLS is set to an unrecognized value (e.g. "tls"/"true"/"on"), which would otherwise fall through to STARTTLS and fail to dial a 465 SMTPS port with no startup hint. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * test(email): cover SMTP_TLS precedence and alias resolution Table-driven test over NewEmailService asserting the implicit-TLS decision: 465 auto-enables implicit; explicit starttls on 465 overrides auto-detect; implicit/smtps/ssl aliases (case-insensitive, whitespace-trimmed) force SMTPS on any port; unknown values fall back to starttls. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * docs: document SMTPS / SMTP_TLS support, drop "465 unsupported" Port 465 implicit TLS is now supported, so the five places that said it was unsupported are wrong. Replace those sentences, add an SMTP_TLS row to the environment-variables tables (EN + ZH), and add a copy-pasteable SMTPS env block to the auth-setup pages. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: guofengchang <guofengchang@cumulon.com> Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
9aa8ba0191 |
fix(runtimes): self-host daemon setup URLs (MUL-2804) (#3474)
Expose self-host daemon setup URLs from /api/config at runtime so the Add computer dialog renders the operator's own server/app domains, while Multica Cloud defaults stay unchanged. Fixes #3013. |
||
|
|
973a43923f |
fix(comments): revert since-delta to issue-wide, steer to parent thread first (#3535)
#3509/#3523 scoped the comment-trigger since-delta count to the triggering
thread, so an agent resuming a busy issue only saw "+N in this thread" and
lost visibility of new comments in other threads. Revert the count to
issue-wide (every thread), keeping the trigger-comment + agent-own
exclusions, and reshape the warm-path hint to:
- report the issue-wide new-comment volume,
- steer the agent to read the triggering (parent) thread FIRST
(`--thread <trigger> --since`, or `--tail 30` for full context),
- demote the issue-wide `--since` catch-up to an only-if-needed fallback
("don't read them all blindly").
Also fixes the now-stale "scoped to the triggering thread" wording in the
resumed-session no-delta hint (it's issue-wide zero now).
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
|
||
|
|
d1c7d478e1 |
MUL-2785: clarify thread-scoped comment delta (#3523)
Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
5aa4fb7487 |
MUL-2760: feat(i18n): add Korean locale support (#3369)
* feat: add korean locale support * feat(i18n): localize Korean landing page * fix(i18n): refine Korean landing copy * fix(i18n): refine Korean translations * fix(i18n): translate Korean landing subpages * fix(i18n): route Korean landing docs links * fix(i18n): add Korean use case content * fix(i18n): polish Korean locale copy * fix(i18n): improve Korean landing copy * fix(onboarding): persist Korean helper artifacts Co-authored-by: multica-agent <github@multica.ai> * fix(web): add use case locale fallback Co-authored-by: multica-agent <github@multica.ai> * Align Korean pull requests wording Co-authored-by: multica-agent <github@multica.ai> * fix(i18n): dedupe docs href helper Co-authored-by: multica-agent <github@multica.ai> * fix(i18n): localize changelog dates Co-authored-by: multica-agent <github@multica.ai> * fix(docs): prerender Korean fallback pages Co-authored-by: multica-agent <github@multica.ai> * fix(docs): align fallback hreflang metadata Co-authored-by: multica-agent <github@multica.ai> * fix(i18n): preserve Chinese CJK font fallback order Co-authored-by: multica-agent <github@multica.ai> * chore(onboarding): update localized comment wording Co-authored-by: multica-agent <github@multica.ai> * test(i18n): harden CJK font fallback assertions Co-authored-by: multica-agent <github@multica.ai> * fix(docs): keep Chinese font fallbacks first Co-authored-by: multica-agent <github@multica.ai> * test(i18n): harden locale fallback coverage Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
9616d78e47 |
MUL-2785: optimize resumed comment reads (#3509)
* feat(comments): skip default thread read on resumed comment sessions Co-authored-by: multica-agent <github@multica.ai> * fix(comments): scope since delta to trigger thread Co-authored-by: multica-agent <github@multica.ai> * chore(comments): address thread delta review nits Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
75b5be3f8e |
feat(comments): roots-only thread stats + summary projection for comment list (MUL-2809) (#3505)
* feat(comments): roots-only thread stats + summary projection for comment list Enrich the roots_only read so each root carries reply_count (recursive descendant count) and last_activity_at (MAX created_at over the subtree), letting an agent triage which thread to open without fetching any replies. Add an orthogonal summary=true projection (--summary) that clips each returned comment's content to a fixed budget and sets content_truncated, so an agent can scan a list cheaply before pulling a full body. It composes with every read mode (default, since, thread, recent, roots_only). New response fields are optional (omitempty) and only populated for the agent-facing query params, so the default response shape is unchanged for the desktop/web and existing CLI callers. Co-authored-by: multica-agent <github@multica.ai> * test(comments): cover roots_only + summary composition end-to-end The summary projection composing with roots_only is the spec's headline "table of contents" read, but it was only exercised at the CLI param- forwarding level — no handler test asserted that a roots_only response both clips content AND keeps reply_count / last_activity_at. A refactor moving the clip into a per-mode branch would silently break that composition with no failing test. Add TestListComments_RootsOnlySummaryComposes: a long root + a reply, read via roots_only=true&summary=true, asserting the root is clipped (content_truncated=true) while its subtree stats still surface. Co-authored-by: multica-agent <github@multica.ai> * refactor(comments): address review nits on roots stats + summary - ListRootComments[Since]ForIssue: scope the recursive membership walk to a selected_roots CTE (the @row_limit page, with the @since cut applied up front) so stats are only computed over the subtrees of the roots actually returned, instead of every thread in the issue. - summarizeContent: scan by rune and stop at the budget+1th rune instead of allocating a full []rune for the whole body, so a pathologically long comment costs only the budget under summary mode. Add a multi-byte (CJK) test to lock rune-boundary clipping. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
ca1ea5716e |
fix(server/child-done): trigger agent parent assignee on child done (MUL-2808) (#3507)
Remove the agent-path self-trigger guard in triggerChildDoneAgent so a child going done wakes its parent agent even when the same agent owns both — a serial sub-task handoff across two different issues, not a loop. Runaway re-triggering stays bounded by HasPendingTaskForIssueAndAgent. Squad path unchanged. Closes #3374. |
||
|
|
c730e906b9 | feat(cli): add roots-only issue comment listing (MUL-2805) (#3288) | ||
|
|
3187bbf90c |
feat(comments): re-add since-delta + cold-start thread read + parent-root write normalization (#3494)
* feat(comments): since-delta new-comment hint + default-on comment session resume (#3432) * feat(db): add unresolved comment count + list filter queries Add CountUnresolvedComments (excludes the agent's own comments) and ListUnresolvedCommentsForIssue. Both are additive — existing callers stay on the unfiltered queries — so old clients are unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(handler): support unresolved-only comment listing Wire an additive `unresolved` query param into ListComments. Defaults off so an old CLI that never sends it gets unchanged behavior; only true/1 enable it. Rejects combining unresolved with thread/recent (whole-issue filter vs navigation models). Includes filter + count query tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(handler): plumb unresolved count + thread root into claim, gate comment resume Populate trigger_parent_id (thread root of the trigger comment) and unresolved_count (excludes the agent's own comments) on comment-triggered claim responses. Both fields are omitempty so old daemons ignore them. Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION (default off): resumed comment turns can inherit the prior turn's "Done." final message, so this stays an explicit rollout switch. The runtime-match and poisoned-session guards still apply regardless of the flag. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(daemon): inject unresolved-comments hint + resolve step into agent brief Add a shared BuildUnresolvedCommentsHint helper rendered on both the per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It ships only the count and the relevant CLI call — never comment bodies — so the server stays cheap. Thread case points at --thread <root>; issue case points at --unresolved. Suppressed when the count is 0. Also add a workflow step telling the agent to `multica comment resolve <thread-root>` once a thread is fully handled, so the unresolved set converges. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cli): add comment list --unresolved and comment resolve command Add an --unresolved filter to `issue comment list` (wired to the server's unresolved param, rejected when combined with --thread/--recent) and a top-level `comment resolve <id>` command that POSTs to the existing /api/comments/{id}/resolve endpoint, letting an agent close threads it has fully handled. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(comments): since-delta new-comment hint + default-on comment resume Simplifies the comment-triggered agent flow down to what's actually needed: - New-comment awareness is now a pure time delta: the claim response carries new_comment_count + new_comments_since (anchored on the prior run's started_at, never completed_at so a long run can't miss comments). The per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s) since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the two surfaces can't drift. Cold start (no prior run) falls back to a plain read. - Comment-triggered tasks resume the prior session by default (same runtime), dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS comment" prompt guard defends against inheriting the prior turn's "Done." marker; GetLastTaskSession still excludes poisoned sessions. - Drops the resolved-based machinery from the first draft: CountUnresolvedComments / ListUnresolvedCommentsForIssue queries, the `comment list --unresolved` flag, the `multica comment resolve` command, and the resolve workflow step. - Removes the verbose cursor-pagination paragraph from the comment prompt; the --thread/--recent/--since flags stay in the CLI/API, just no longer explained inline every turn. Compatibility: new claim fields are omitempty (old daemons ignore them). Comment resume is default-on and affects even old daemons, which already consume prior_session_id. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(comments): collapse reply parent_id to thread root on write Comment threads are a 2-level model (root + flat replies, like Linear/Slack), enforced today only by the UI and the agent path — the CreateComment handler stored whatever parent_id it was handed, and the agent-side flatten walked just one level, so a reply-to-a-reply could land at depth 3+. Add GetThreadRoot (a recursive walk to the parent_id=NULL root) and run both write paths (handler.CreateComment, service.createAgentComment) through it, so every stored reply's parent_id IS its thread root. Readers can now treat parent_id as the thread root without re-walking. The agent-drift guard still compares the raw parent_id to the trigger comment before normalization. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(comments): cold-start reads triggering thread, warm keeps --thread pointer The since-delta rework dropped the thread-first read on the COLD path: a first-time agent fell back to the flat `comment list` dump (oldest-first, cap 2000), burying the trigger's context in ancient chatter. Point cold start at the triggering conversation instead via a shared BuildColdCommentsHint (`--thread <trigger> --tail 30` + a --recent pointer for cross-thread background). On the WARM path, --since is a pure time delta and can miss the triggering thread's pre-anchor history, so BuildNewCommentsHint now also emits a --thread pointer. Both surfaces (per-turn prompt + CLAUDE.md workflow) render via the shared helpers so they cannot drift (PR #2816 rule). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
270fb6aa73 |
MUL-2792 fix(agent): preserve skills in update/archive/restore response (#3464)
* MUL-2792 fix(agent): preserve skills in update/archive/restore response (#3459) agentToResponse always initialises Skills as []; the mutation handlers relied on the caller to refresh it, but only GetAgent and ListAgents actually did. UpdateAgent / ArchiveAgent / RestoreAgent therefore returned "skills": [] regardless of what the agent_skill junction table contained. The DB write path was never wrong — skills weren't actually deleted — but the misleading response (and its matching agent:status / archived / restored WS broadcast) scared users into manually re-running `agent skills set` and risked scripted clients writing the empty set back as truth. Extract the existing GetAgent skill-reload block into attachAgentSkills and call it from the three buggy handlers. Add regression tests that attach skills, hit each mutation endpoint, and assert both the response and the junction table. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): attach skills before env/template broadcasts (#3459) Two follow-up sites flagged in PR #3464 review that shared the same "agentToResponse zeroes Skills, callers forget to reload" pattern as the mutation handlers: - agent_env.go: the agent:status broadcast after UpdateAgentEnv used a bare agentToResponse, so subscribers saw skills wiped on every env rotation. HTTP body is AgentEnvResponse so the response itself is unaffected, but the WS event still misleads any cache that ingests it. - agent_template.go: CreateAgentFromTemplate attaches imported and extra skills inside the tx, then builds the response/agent:created broadcast without reloading them — so callers (and any client tracking the create event) see the freshly created agent as skill-less despite the template having just imported them. Both call sites now reuse attachAgentSkills introduced for UpdateAgent. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
fa076d38f2 |
MUL-2778 feat(agent): wire mcp_config through OpenClaw runtime (#3450)
* MUL-2778 feat(agent): wire mcp_config through OpenClaw runtime The MCP config tab (#3419) lets admins save mcp_config on an agent, and recent work (#3439) plumbed it through the three ACP runtimes. OpenClaw still ignored the field, leaving the Tab silently inert for any OpenClaw-backed agent. Translate the agent's Claude-style `{"mcpServers": {...}}` into the per-task OpenClaw wrapper's `mcp.servers` block — OpenClaw resolves MCP via its own config schema rather than ExecOptions, so the existing OPENCLAW_CONFIG_PATH preparer is the right seam. Fail closed on malformed JSON / entries missing `command` or `url`, matching the fail-closed posture the preparer already uses for the agents.list step. Null / absent mcp_config leaves the wrapper free of an `mcp` key so the user's global mcp.servers flows through untouched; an explicit empty managed set (`{}` / `{"mcpServers":{}}`) is honoured as "admin saved no servers" mirroring `hasManagedCodexMcpConfig`. Strict-mode replacement (drop user-only servers entirely) would require OpenClaw to do a per-key replace rather than a deep merge at `mcp.servers`; the comment documents that caveat rather than relying on undocumented behaviour. Also adds `openclaw` to `MCP_SUPPORTED_PROVIDERS` so the MCP Tab actually surfaces in the agent overview pane, and pins the new visibility case with a renderPane test. Co-authored-by: multica-agent <github@multica.ai> * MUL-2778 fix(agent): make openclaw mcp_config strict-replace via sanitized snapshot Elon flagged on #3450 that the previous wiring let user-only mcp.servers leak through the wrapper's `$include` of the live user config: deep-merge at `mcp.servers` keeps user-only names, and the strict-empty case (`{ "mcpServers": {} }`) silently inherited user globals. Switch the strict-replace path to write a sanitized snapshot of the user's fully resolved config (via `openclaw config get --json`) with the `mcp` block stripped, then have the wrapper `$include` the snapshot instead of the live user file. With the user's `mcp` gone from the $include resolution, the wrapper's `mcp.servers` is the only definition the embedded OpenClaw sees — managed only, including the explicit empty set. The snapshot lives in envRoot at 0o600 alongside the wrapper so the GC reaper sweeps it with the rest of the task scratch, and no extra OPENCLAW_INCLUDE_ROOTS entry is needed (same-dir $include). Fail-closed on `config get --json` errors so the daemon never silently falls back to the leaky $include path. The inherit branch (null mcp_config) still uses the live user file directly — no extra CLI roundtrip and no snapshot is written. New tests pin the contract Elon's review required: - TestPrepareOpenclawConfigStrictReplacesUserMcpServers: user has global_one + shared, managed has shared + managed_only → wrapper has exactly {shared (managed value), managed_only}; global_one does NOT leak; snapshot file has the user's `mcp` stripped while preserving gateway / providers / API keys. - TestPrepareOpenclawConfigStrictEmptyManagedSetDropsUserMcp: empty managed set drops user's global_one (both `{}` and `{"mcpServers":{}}` cases). - TestPrepareOpenclawConfigNullMcpConfigKeepsUserInclude: null path inherits the live user config, writes no snapshot, makes no extra CLI call. - TestPrepareOpenclawConfigFailsClosedOnResolvedConfigError: errors during `config get --json` surface; no stale wrapper or snapshot. - TestPrepareOpenclawConfigManagedSetFreshInstall: fresh install with managed mcp_config skips the snapshot dance entirely. Also tightens en + zh-Hans MCP Tab copy to mention OpenClaw goes via the per-task wrapper, and to use OpenClaw's own `transport` field rather than Claude's `type` for HTTP/SSE entries. Co-authored-by: multica-agent <github@multica.ai> * MUL-2778 fix(agent): narrow openclaw snapshot strip to mcp.servers only Elon's third-round must-fix: the previous strict-replace snapshot deleted the entire `mcp` block, which wiped out non-server settings under `mcp` like `sessionIdleTtlMs`. Those are documented OpenClaw config keys (https://docs.openclaw.ai/gateway/configuration-reference#mcp) outside the MCP Tab's scope — the agent's saved mcp_config only manages server definitions, so other `mcp.*` tuning the user set must survive. Replace the blanket `delete(resolved, "mcp")` with a stripUserMcpServers helper that: - deletes only `mcp.servers` when `mcp` is an object - drops the parent `mcp` key only when the object is empty after the strip (so we don't emit `mcp: {}` placeholders) - leaves non-object `mcp` values untouched (we only know how to strip servers from the documented shape) Pinned with TestPrepareOpenclawConfigStrictPreservesNonServerMcpKeys: user resolved has both `mcp.sessionIdleTtlMs: 300000` and `mcp.servers.global_one`; after the strict path runs the snapshot keeps the TTL and drops the servers map, and the wrapper's `mcp.servers` is exactly the managed set with no leak. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
d90732750f |
Revert "feat(comments): since-delta new-comment hint + default-on comment ses…" (#3455)
This reverts commit
|
||
|
|
ee4ec3b76d |
MUL-2784 fix(daemon): cleanup sidecar tree (.agent_context / .multica / provider skills) after local_directory tasks (#3444)
* fix(daemon): cleanup .agent_context / .multica / provider skill sidecars after local_directory tasks (MUL-2784) PR #3438 (MUL-2753) only restored CLAUDE.md / AGENTS.md / GEMINI.md to their pre-task bytes; the sidecar tree writeContextFiles seeds (.agent_context/, .multica/, .claude/skills/, .github/skills/, .opencode/skills/, skills/, .pi/skills/, .cursor/skills/, .kimi/skills/, .kiro/skills/, .agents/skills/, fallback .agent_context/skills/) was explicitly deferred to this follow-up. In local_directory mode the agent's workdir is the user's repo, so each task accumulates one more layer of those directories in the user's tree. Plan A: track every file/dir Prepare creates inside workDir in a sidecarManifest written to envRoot/.multica_sidecar_manifest.json (daemon scratch — never in the user's workdir). On local_directory teardown CleanupSidecars walks the manifest, removes the recorded files, then rmdir-iterates the recorded directories in reverse. Pre-existing files and directories are deliberately NOT recorded, so a user-installed .claude/skills/my-own-skill/ sibling — or any unrelated file the user keeps under .claude/, .github/, etc. — is preserved bit-for-bit. Non-empty rmdir fails ENOTEMPTY and is silently skipped, which is the signal that the user owns the directory. Daemon wiring lives next to the existing CleanupRuntimeConfig defer in runTask: runtime brief first, sidecars second. Cloud-mode runs still write a manifest for symmetry but never trigger the cleanup (the GC loop wipes envRoot wholesale). Tests (sidecar_manifest_test.go) cover the round-trip invariant per the issue's acceptance criteria: - empty workdir → Prepare → Cleanup → empty workdir, byte-exact, for every file-based provider (claude, codex, copilot, opencode, openclaw, hermes, pi, cursor, kimi, kiro, antigravity, gemini), - user's .claude/skills/my-own-skill/ (and equivalents per provider) survives Cleanup intact, - unrelated user files under .claude/, .github/, etc. survive, - three repeated cycles do not accumulate any orphan state, - project_resources branch (.multica/project/resources.json) is also reversible, - recordWriteFile refuses to record pre-existing files, - recordMkdirAll refuses to record pre-existing dirs, - Cleanup is a no-op when the manifest file is missing. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): refuse to overwrite pre-existing sidecar paths; pick collision-free skill slugs (MUL-2784 review) Addresses PR #3444 review (Elon): **Must-fix #1**: recordWriteFile used to overwrite pre-existing target files unconditionally and only skip the manifest record. That destroys user bytes at write time AND leaves the corrupted contents in place at cleanup time — the byte-exact contract the issue requires is violated on both halves. Fixed by making recordWriteFile detect any pre-existing entry (regular file, symlink, directory) via Lstat and return a sentinel errPathPreExists without touching the path. The user's bytes are preserved verbatim. For per-skill collisions (user's .claude/skills/issue-review/ vs Multica's "Issue Review"), writeSkillFiles now allocates a collision-free sibling slug via allocateCollisionFreeSkillDir: first attempt is the natural slug, then `<base>-multica`, `<base>-multica-2`, …, bounded at 64 attempts. Provider-native discovery still picks the skill up (every subdir under skillsParent is a distinct skill) and the user's path stays bit-for-bit intact. For Multica-only namespace files (.agent_context/issue_context.md, .multica/project/resources.json), the writer swallows errPathPreExists and continues — the runtime brief already carries every fact those files would, so a collision degrades to brief-only mode rather than destroying user content. **Must-fix #2**: Added byte-exact collision matrix tests covering every file-based provider (claude / codex / copilot / opencode / openclaw / hermes / pi / cursor / kimi / kiro / antigravity / gemini): - TestPrepareThenCleanupSidecarsSameSlugCollisionPerProvider: seeds user's `<provider>/skills/issue-review/SKILL.md` plus a private notes.md sibling, runs Prepare → Inject → Cleanup, asserts workdir snapshot is byte-identical to seed. - TestPrepareThenCleanupSidecarsIssueContextCollisionPerProvider: seeds user's `.agent_context/issue_context.md`, asserts round-trip preserves it. - TestPrepareThenCleanupSidecarsProjectResourcesCollisionPerProvider: same for `.multica/project/resources.json`. - TestPrepareThenCleanupSidecarsMultiSkillCollisionFreeAllocation: end-to-end check that the Multica skill lands at the collision-free sibling and Cleanup removes only the Multica side. - TestAllocateCollisionFreeSkillDir: directed unit test pinning the slug-bumping sequence. - TestRecordWriteFileRefusesToOverwritePreExistingFile (was TestRecordWriteFileSkipsPreExistingFile): flipped to assert the user's bytes survive and errPathPreExists is returned. - TestRecordWriteFileRefusesToOverwriteSymlinkOrDir: covers the Lstat path for non-file entries. **Should-fix**: CleanupSidecars used to swallow ANY non-ENOENT rmdir error as "user content present," silently dropping real I/O failures (EACCES, EPERM, EBUSY). Now it re-reads the directory after a failed rmdir via the new dirHasEntries helper — non-empty → silently skip (ENOTEMPTY, the intended branch); empty → genuine error, captured into firstErr and surfaced. Plus directed tests: - TestCleanupSidecarsSurfacesRealRmdirErrors - TestDirHasEntries Local verification: - go test ./internal/daemon/execenv/... — all green - go test ./internal/daemon/... — all green - go vet ./... — clean Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): surface original rmdir error when post-rmdir ReadDir also fails (MUL-2784 review) Addresses remaining PR #3444 review blocker (Elon): dirHasEntries used to return true when ReadDir failed with anything other than ENOENT, which made CleanupSidecars treat every locked / faulted directory as ENOTEMPTY and silently drop the original rmdir error. The v1 fix from the previous round closed the EACCES-on-empty-dir branch but missed the case where the chmod also blocks ReadDir — exactly the failure mode the review called out. Helper change: dirHasEntries now returns (hasEntries, ok bool): - (false, true) — dir exists and is empty (or missing, race-safe) - (true, true) — dir has user content (the ENOTEMPTY branch) - (_, false) — ReadDir failed (EACCES, ENOTDIR, EIO, …); the caller cannot tell ENOTEMPTY from a real error and MUST surface the original rmdir error CleanupSidecars switches on (ok, hasEntries): - !ok → surface the ORIGINAL rmdir error (not the ReadDir failure — that's diagnostic plumbing and would distract from the root cause) - ok && hasEntries → swallow silently (intended ENOTEMPTY branch; preserve user content) - ok && !hasEntries → surface the rmdir error (empty dir + EACCES / EPERM / EBUSY → genuine cleanup failure) Tests: - TestDirHasEntries: extended with a regular-file sub-case (ReadDir returns ENOTDIR) asserting (false, false). The v1 helper returned (true) here, hiding the bug. - TestCleanupSidecarsSwallowsMissingAndNonEmptyDirs: renamed from TestCleanupSidecarsSurfacesRealRmdirErrors. The old name claimed to test the surfacing path but never actually exercised it. - TestCleanupSidecarsSurfacesEACCESOnEmptyRecordedDir: chmod parent to 0o555 so rmdir(recorded) fails EACCES while ReadDir(recorded) still succeeds (empty). Asserts firstErr is non-nil and references both the recorded path and the rmdir branch. Skipped when running as root (chmod is bypassed for uid 0). - TestCleanupSidecarsSurfacesEACCESWhenReadDirFailsToo: the must-fix case — chmod parent 0o555 AND chmod recorded 0o000 so BOTH rmdir and ReadDir fail. The surfaced error must be the ORIGINAL rmdir failure, not the ReadDir one. Skipped on uid 0. Local verification: - go test ./internal/daemon/execenv/... — all green - go test ./internal/daemon/... — all green - go vet ./... — clean Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
90a737fc7e |
fix(daemon): retry terminal task callbacks on transient errors (MUL-2780) (#3443)
CompleteTask / FailTask used to be fire-once. A 1-second upstream 502 burst would drop the call, then the immediate fail-fallback also 502'd, leaving the task stuck in `running` forever and showing the agent as "still working" in the UI. Add a bounded retry around the two terminal callbacks: 4s, 8s, 16s, 32s, 64s backoff schedule (5 retries, ~124s ceiling), retrying only on transient errors (5xx, 408, 429, transport-level) and bailing immediately on permanent 4xx. Also fix a latent bug where a transient complete failure would silently downgrade a successful run to a fail: the fallback now triggers only on permanent errors. Server-side CompleteTask / FailTask are already idempotent on "already terminal", so replays from a retry are safe even if the prior 502'd response was actually persisted. Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
90ddfb04e2 |
feat(self-host): DISABLE_WORKSPACE_CREATION env var (MUL-2777) (#3441)
* feat(self-host): DISABLE_WORKSPACE_CREATION env var (MUL-2777, #3433) When self-hosters set DISABLE_WORKSPACE_CREATION=true, POST /api/workspaces returns 403 for every caller and the UI hides every "Create workspace" affordance (sidebar, modal, /workspaces/new page, onboarding Step 2). This closes the gap where ALLOW_SIGNUP=false still let any signed-in user open an isolated workspace the platform admin couldn't see. - server: new Config.DisableWorkspaceCreation, gate in CreateWorkspace, workspace_creation_disabled in /api/config, Go tests. - frontend: new workspaceCreationDisabled in configStore, hide sidebar entry, swap NewWorkspacePage / CreateWorkspaceModal / onboarding StepWorkspace to a "creation disabled, ask for invite" state when the flag is on, EN + zh-Hans locale strings. - ops: .env.example, docker-compose.selfhost, helm values + configmap, SELF_HOSTING.md, SELF_HOSTING_ADVANCED.md, environment-variables docs (EN + zh). Co-authored-by: multica-agent <github@multica.ai> * fix(onboarding): drive create path off workspaceCreationAllowed (#3433) PR #3441 review: when DISABLE_WORKSPACE_CREATION=true and the user already has a workspace, StepWorkspace still walked the resume copy (`headline_resume` / `lede_resume` mentioning "or start another") and `creatingActive` ignored the flag, leaving a stale clickable create CTA possible if /api/config arrived late. Refactor StepWorkspace to derive a single `workspaceCreationAllowed` boolean from the config store. It now drives: - Initial `mode` state (defaults to "existing" when disabled + reusing so the CTA is pre-armed for the only valid action). - `creatingActive` so the footer CTA cannot fall back into the create branch even mid-render. - Eyebrow / headline / lede strings — adds `creation_disabled_{eyebrow,headline,lede}_resume` (EN + zh-Hans) for the disabled + reusing variant. Tests: cover the three reachable shapes — flag off + no existing, flag on + no existing, flag on + existing. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
03f70209c4 |
fix(daemon): preserve user CLAUDE.md / AGENTS.md / GEMINI.md in local_directory runs (#3438)
* fix(daemon): preserve user CLAUDE.md / AGENTS.md / GEMINI.md in local_directory runs (MUL-2753) InjectRuntimeConfig previously called os.WriteFile unconditionally, which truncated whatever file lived at the same path. For the local_directory project_resource flow the workdir is the user's own repo, so the agent silently destroyed any repo-level CLAUDE.md / AGENTS.md / GEMINI.md the first time it ran in that directory, and the daemon's local-directory cleanup explicitly skips the user's path so the file was never restored. Write the brief inside a marker block instead: <!-- BEGIN MULTICA-RUNTIME (auto-managed; do not edit) --> ...brief... <!-- END MULTICA-RUNTIME --> writeRuntimeConfigFile handles three states: - file missing -> create with just the marker block, - file present, no marker block -> append the marker block at the end (preserves user-authored content above), and - file present, marker block already there -> replace the block body in place so repeated runs don't grow the file unboundedly. This is the short-term fix called out on MUL-2753. The sidecar question (.agent_context/, .claude/skills/, .multica/project/resources.json) is left for a follow-up — those files don't overwrite user content, just litter the workdir. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): cleanup runtime config marker block after local_directory tasks (MUL-2753) Address Elon's review on PR #3438: 1. Add `CleanupRuntimeConfig` and wire it into the daemon's task path so `local_directory` runs excise the marker block on the way out. Without it, a user's subsequent manual `claude` / `codex` / `gemini` run in the same directory picks up the previous task's stale brief (issue id, trigger comment id, reply rules) and acts on the wrong context. Cloud workspace runs skip the cleanup — their scratch workdir is wiped by the GC loop anyway. 2. If excising the block would leave the file empty / whitespace-only, the file is removed so we don't leave behind a stub the user has to delete by hand. Surviving user content is preserved byte-for-byte. 3. Harden the marker parser: search for the end marker strictly after the begin marker. The previous `strings.Index` pair mishandled two malformed cases — - a stray end marker before any begin (e.g. user pasted a documentation snippet showing the wire format) would cause every run to stack another block, growing the file unboundedly; - a half-block left by a previous crashed run would cause every subsequent run to append a fresh block beneath the half-block. The `locateMarkerBlock` helper now anchors the end search past the begin offset, and treats "begin found, no end after" as "block runs to EOF" so the next write replaces it cleanly. Centralised the provider→filename mapping in `runtimeConfigPath` so Inject and Cleanup can't drift past each other when a new provider is added. Tests cover: parser hardening (stray-end-before-begin idempotency, half-block recovery), Cleanup happy path / file removal / no-op cases / malformed half-block / per-provider mapping, and an end-to-end inject→cleanup round trip that locks in byte-identical restoration of the user's pre-injection file. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): byte-exact inject/cleanup round trip for runtime config (MUL-2753) Address Elon's second-round review on PR #3438. The previous cleanup relied on `TrimRight + "\n"` for trailing newlines and `TrimSpace == ""` for file removal — both compensated for the inject path's "normalise trailing newlines so there's always exactly `\n\n` before the block" step, but they did so by mutating the user's bytes. The result was a real diff on three boundary cases: - file ended without a newline (`rules`) → cleanup added one; - file ended with two or more newlines (`rules\n\n`) → cleanup collapsed to a single newline; - file pre-existed but was empty / whitespace-only → cleanup deleted it. Reshape the contract so the bytes inject adds are the exact bytes cleanup removes, with no user-byte mutation in between: - Define `runtimeManagedSeparator = "\n\n"` as a fixed managed separator that inject always inserts (unconditionally — including for files that already end in two or more newlines) between pre-existing user content and the marker block. - Inject's missing-file branch still writes the block alone (no separator); that absence is the marker Cleanup uses to identify "we created this file from scratch" and is the only condition under which Cleanup is allowed to `os.Remove` the file. - Cleanup detects `HasSuffix(pre, runtimeManagedSeparator)` and strips exactly those bytes; whatever remains is written back verbatim with no `TrimRight` / `TrimSpace`, so the pre-injection bytes survive exactly. The replace-in-place branch is untouched — the managed separator established by the first inject lives in pre and survives across subsequent runs, so byte-exactness is preserved through arbitrary inject→inject→cleanup chains. Tests: - `TestInjectThenCleanupRoundTripByteExactBoundaries` parameterises 9 seed shapes (missing file, empty, whitespace-only, no trailing newline, one trailing newline, two trailing newlines, many trailing newlines, CRLF line endings, no final newline with embedded blank lines) and asserts byte-identical round trip across two full cycles. - `TestInjectReplaceThenCleanupRestoresByteExact` covers the replace-in-place branch for the same boundary seeds. - `TestWriteRuntimeConfigFileAlwaysInsertsFixedManagedSeparator` pins the new invariant at the source: regardless of seed shape, inject emits `<seed><\n\n><marker block>` with no normalisation. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> |