multica

mirror of https://github.com/multica-ai/multica.git synced 2026-07-05 13:29:44 +02:00

Author	SHA1	Message	Date
LinYushen	f6ac53a967	fix: squad leader no_action must not post comment on comment-triggered path (#2573) PR #2564 only added IsSquadLeader handling to the assignment-triggered workflow path and the Output section. When a squad leader is triggered by a comment (the common case for re-evaluation), the comment-triggered workflow path had NO squad leader special handling, so the model still posted comments announcing no_action/silence. Changes: - runtime_config.go: Add IsSquadLeader check to comment-triggered step 4 with explicit prohibition against posting no_action announcement comments - runtime_config.go: Strengthen Output section from 'may exit silently' to 'MUST exit without posting any comment' with explicit DO NOT examples - runtime_config.go: Strengthen assignment-triggered step 5 similarly - prompt.go: Add squad leader no_action rule to per-turn comment prompt when trigger author is an agent and agent instructions contain the Squad Operating Protocol marker - Add tests for both the per-turn prompt and CLAUDE.md generation Fixes MUL-2168 Co-authored-by: multica-agent <github@multica.ai>	2026-05-14 12:36:06 +08:00
Bohan Jiang	334d9cdd02	fix(squad): skip leader when a member @mentions anyone (MUL-2170) (#2569) * fix(squad): skip leader on comment when a member @mentions any agent (MUL-2170) When a human commenter routes an issue directly at a specific agent via [@Name](mention://agent/<id>), the squad leader was still being woken up to evaluate the same comment. The leader's only real options were to re-delegate to the agent the member already named or to record no_action — both of which produce queue noise without changing the outcome. This skips the leader-enqueue path entirely when: - the assignee is a squad, - the comment author is a member, AND - the comment body contains at least one agent mention. Agent-authored comments are intentionally exempt: when an agent posts an update that @mentions another agent, the leader still needs to coordinate the thread. The existing leader-self-trigger guard is preserved. Only the current comment's body is inspected — parent (thread root) mentions are not inherited here. Tests cover the helper (mentions parsing) plus the integration matrix: member plain / member @member / member @non-leader-agent / member @leader / agent @agent / leader-self. Co-authored-by: multica-agent <github@multica.ai> * test(squad): exercise full CreateComment path for leader-skip rule (MUL-2170) Adds an integration test that drives the HTTP-layer CreateComment handler (not just the helper) to lock the call-site wiring: a member top-level comment with an @agent skips the squad leader, and a subsequent plain reply in the same thread DOES wake the leader — the parent's @agent mention must not be inherited into the leader-skip decision. Picks up a non-blocking review note on PR #2569. 
Co-authored-by: multica-agent <github@multica.ai> * fix(squad): skip leader on any explicit member mention, not only @agent (MUL-2170) Broaden the leader-skip rule for squad-assigned issues: a member comment that explicitly @mentions anyone — @agent, @member, @squad, or @all — counts as deliberate routing and the squad leader stays out. Issue cross-references (mention://issue/...) are not routing and still trigger the leader as before. Per Bohan's follow-up on MUL-2170 — @member should suppress the leader for the same reason @agent does: the human has already pointed at a specific recipient, so a leader turn would just be observation noise. Helper renamed commentMentionsAnyAgent → commentMentionsAnyone with explicit handling of all four routing mention types. Existing call-site wiring (current-comment-only, agent-author exemption, leader self-trigger guard) is unchanged. Tests updated and extended to cover the full routing matrix: @member / @squad / @all / @issue (cross-ref) plus the @agent variants already covered. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-14 12:22:10 +08:00
fr00st	cc9fbd3db0	Fix stale Done replies on comment follow-ups (#2495) * fix: avoid stale done replies on comment follow-ups * fix: avoid inlining runtime brief for Hermes ACP * fix: address comment follow-up review feedback	2026-05-14 12:00:04 +08:00
LinYushen	7a1284128d	fix: allow squad leader to exit silently on no_action without posting a comment (#2564) The runtime prompt's Output section unconditionally required all tasks to post a comment via 'multica issue comment add', which conflicted with the squad leader protocol that says to 'exit silently' on no_action. Changes: - Add IsSquadLeader bool to TaskContextForEnv (detected via Squad Operating Protocol marker in agent instructions) - Relax the Output section and assignment-triggered workflow step 5 to allow squad leaders to exit with only a 'multica squad activity' call when the outcome is no_action Fixes MUL-2168 Co-authored-by: multica-agent <github@multica.ai>	2026-05-14 11:33:15 +08:00
Bohan Jiang	21b49eb59b	fix(cli): resolve squad assignees in issue create/update/assign (MUL-2165) (#2551) * fix(cli): resolve squad assignees in issue create/update/assign (MUL-2165) The CLI assignee resolver only searched workspace members and agents, so a quick-create input like "assign to <SquadName>" silently fell through to "Unrecognized assignee: <SquadName>" in the issue description — even though squads are first-class assignees server-side and the prompt's whole point was to route the work for the user. Extend resolveAssignee / resolveAssigneeByID to also fetch /api/squads, teach the actor display lookup to render squad names in table output, update the quick-create prompt and runtime-config command listing to mention `multica squad list` alongside members and agents, and lock in the new behavior with tests. Co-authored-by: multica-agent <github@multica.ai> * fix(cli): gate squad assignee resolution behind an allowed-kinds set (MUL-2165) The earlier MUL-2165 fix taught resolveAssignee / resolveAssigneeByID to also return (squad, ...), but those helpers are shared. Project lead and issue subscriber callers were still using them, and their target schemas reject squads — project.lead_type has a DB CHECK constraint (server/migrations/034_projects.up.sql:10) and the subscriber handler's isWorkspaceEntity switch only knows member/agent (server/internal/handler/handler.go:414). So `multica project create --lead "<SquadName>"` and `multica issue subscriber add --user "<SquadName>"` would resolve to (squad, ...) and surface as a 500/403 server-side instead of a clean CLI-side resolution error. Thread an assigneeKinds set through the resolver and the pickAssigneeFromFlags helper. Issue create/update/assign/list pass `issueAssigneeKinds` (all three); project lead and subscriber pass `memberOrAgentKinds`. 
The squads fetch is skipped entirely when not allowed, and the not-found / no-match error wording adapts to the allowed kinds so it never mentions a type the caller cannot use. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-13 22:31:50 +08:00
Bohan Jiang	0345285b86	feat(quick-create): searchable actor picker + squad support (#2552) * feat(quick-create): searchable actor picker + squad support (MUL-2163) - Replaces the flat agent dropdown in the "Create with agent" modal with a searchable PropertyPicker that lists Agents and Squads in separate sections, so users can filter by name and pick a squad as the creator. - Persists the selection as (lastActorType, lastActorId), removing the agent-only lastAgentId field on the quick-create store. - Adds squad_id to the quick-create API request and stamps it onto the task's QuickCreateContext. The handler resolves the squad to its leader agent (re-using validateAssigneePair) and the daemon claim path injects the squad-leader briefing when the task carries a squad hint, matching the behavior of issue-bound squad tasks. Co-authored-by: multica-agent <github@multica.ai> * fix(create-issue): forward squad picks across manual→agent switch Manual mode → agent mode previously only carried `agent_id`, so picking a squad and then flipping to agent silently fell back to the persisted actor / first visible agent and lost the user's choice. Carry `squad_id` on the same branch so the agent panel honors the squad pick. Adds a sibling test alongside the existing project-carry case. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-13 22:31:17 +08:00
LinYushen	29082f7cfe	feat: implement Squad feature MVP (#2505) * feat: implement Squad feature MVP - Add migration 084_squad: squad, squad_member, squad_activity_log tables - Extend issue.assignee_type to support 'squad' - Add sqlc queries for squad CRUD, member management, activity logs - Add Go handler with full Squad API (CRUD, members, activity log) - Register routes: /api/squads/, /api/issues/{id}/squad-activity, /api/squad-activity - Add Squad trigger logic: - Assign Squad immediately triggers leader - Every external comment on squad-assigned issue triggers leader - Anti-loop: squad members' comments don't trigger leader - Dedup: skip if leader already has pending task - Add squad activity log API (Option B) for leader no-op recording - Add frontend TypeScript types (Squad, SquadMember, SquadActivityLog) - Add protocol events: squad:created, squad:updated, squad:deleted Co-authored-by: multica-agent <github@multica.ai> fix: address PR review blocking issues 1. validateAssigneePair now accepts 'squad' assignee_type 2. All squad endpoints validate workspace ownership via GetSquadInWorkspace 3. CreateSquadActivityLog restricted to squad leader agent only 4. AddSquadMember validates member exists in workspace 5. UpdateSquad auto-adds new leader to squad members 6. DeleteSquad transfers assigned issues to leader before deletion 7. 
IssueAssigneeType includes 'squad' in frontend types Co-authored-by: multica-agent <github@multica.ai> * feat: soft-delete squads via archive instead of hard delete - Add migration 085: archived_at + archived_by columns on squad table - ListSquads now excludes archived squads (ListAllSquads for admin) - DeleteSquad → ArchiveSquad (sets archived_at, preserves all records) - Transfer squad-assigned issues to leader before archiving - SquadResponse includes archived_at/archived_by fields - Frontend Squad type updated with nullable archived fields Co-authored-by: multica-agent <github@multica.ai> * feat: re-add Squads frontend entry (sidebar nav + pages) Re-applies the frontend squad entry that was lost during a merge: - Sidebar nav: Squads item with Users icon - Paths: squads() and squadDetail() in workspace paths - Routes: /squads and /squads/[id] pages - Views: SquadsPage (list) and SquadDetailPage - i18n: en 'Squads' / zh '小队' - Reserved slug: 'squads' Co-authored-by: multica-agent <github@multica.ai> * fix: fix SquadsPage rendering - use PageHeader children pattern PageHeader takes children, not title/actions props. The incorrect usage caused a React rendering error. Now matches the pattern used by autopilots and agents pages. 
Co-authored-by: multica-agent <github@multica.ai> * fix(squads): add API client methods and package export for squads pages * feat: complete Squad frontend - create dialog, member management, API methods - Add CreateSquadModal with name/description/leader selection - Register 'create-squad' in modal registry - Wire 'New Squad' button to open the modal - Add full API client methods: createSquad, updateSquad, deleteSquad, addSquadMember, removeSquadMember - Rewrite SquadDetailPage with: - Member list showing resolved names - Add/remove member UI - Archive squad button - Back navigation to squads list Co-authored-by: multica-agent <github@multica.ai> * feat: improve Squad UI - match create agent dialog style - CreateSquadModal: proper Dialog with Header/Description/Footer, agent picker with avatars, textarea for description - SquadDetailPage: centered max-w-2xl layout, ActorAvatar for members, Crown badge for leader, textarea for member description, improved spacing and visual hierarchy - Renamed 'role' field label to 'Description' in add member form (describes the member's responsibilities in the squad) Co-authored-by: multica-agent <github@multica.ai> * feat(squad): add avatar, instructions; drop unique-name constraint - 086: add squad.avatar_url - 087: drop unique constraint on squad.name (squads with the same name are legitimate across teams; uniqueness was an accidental product constraint) - 088: add squad.instructions (text, default '') - UpdateSquad now COALESCEs avatar_url + instructions - handler exposes Instructions in SquadResponse and accepts it in UpdateSquad Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(squad): assignable + mention target; trigger leader on assign - assignee picker and @mention suggestion list squads alongside agents and members; renders squad avatar/icon - creating or updating an issue with assignee_type=squad enqueues a task for the squad's current leader (mirrors agent-assignee parking-lot rule: skip 
backlog only) - workspace queries/hooks expose squads where needed for the pickers - locales updated for new picker copy Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(squad): agent-style detail page with members + instructions tabs - restructure squad detail page to mirror the agent detail page: 320px inspector (creator, leader, created/updated) + tabbed pane (Members \| Instructions) with dirty-guard AlertDialog - inline name + avatar editing on the inspector - inline description editor (modal textarea) - members tab: leader + member picker with role descriptions, swap leader, edit member roles, remove - instructions tab: ContentEditor + Save (mirrors agent pattern) - squads list shows the squad avatar/icon - core types + api.updateSquad accept avatar_url + instructions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(squad): inject leader briefing on claim (protocol + roster + instructions) When a squad's leader agent claims a task on a squad-assigned issue, append a system-level briefing to the agent's Instructions composed of: 1. Squad Operating Protocol — hard-coded rules: leader is a coordinator, dispatch via @mention, stop after dispatching, resume on re-trigger, do not work outside the roster. 2. Squad Roster — leader self-row plus one row per non-archived member with a literal mention markdown string ([@Name](mention:// agent\|member/<UUID>)) the leader can paste verbatim. Round-trips through util.ParseMentions, enforced by a contract test. 3. Squad Instructions — the user-defined squad.instructions block, omitted entirely when empty so we do not leave a dangling heading. Non-leader members claiming the same issue receive no briefing. Tests cover: full squad with mixed agent/human members, lone leader, archived agents skipped, empty user instructions, mention round-trip, and the leader/non-leader claim-handler gate. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(squad): tell leader not to restate issue context in dispatch comment After observing leaders padding their delegation comments with full re-summaries of the issue body and prior discussion, make the Operating Protocol explicit: - assignees on Multica already have the full issue (title, description, all comments, attachments) and workspace context; - delegation comments should add only what cannot be inferred (who is picked, why, extra constraints), aim for two or three sentences; - restating context is now an explicit hard rule violation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(squad): unify leader evaluation into activity_log, add CLI command - Squad member comments now trigger leader (only leader self-excluded) - Replace squad_activity_log with activity_log (action: squad_leader_evaluated) - Add CLI: multica squad activity <issue-id> <outcome> --reason - Add API: POST /api/issues/{id}/squad-evaluated - Update squad operating protocol to require evaluation recording - Remove squad_activity_log table from schema and generated code * feat(cli): add squad list, get, member list commands * fix(squad): address review findings (P1+P2) P1 fixes: - Add 'squads' to reserved_slugs.json (source of truth) - Add 'create-squad' to ModalType union - Remove unused leaderOpen/selectedLeader in create-squad modal - Replace literal JSX strings with i18n selectors (en + zh-Hans) P2 fixes: - Add 'squad' to mention regex (MentionRe) - Fix human member lookup in squad briefing (use GetUser directly) - Add squads routes to desktop app - Add squad:created/updated/deleted to WSEventType + invalidation - Reject archived squads as issue assignees * fix(squad): restore zh-Hans key, publish activity event, invalidate issues on archive - Restore create_project.title in zh-Hans modals.json (dropped by prior edit) - Publish activity:created WS event after squad leader evaluation - 
Invalidate issue queries on squad:deleted (archive transfers assignees) - Add creator info to squad list cards * fix(squad): realtime sync, rerun support, leader validation - Use workspaceKeys.squads prefix for detail/member queries (realtime invalidation) - Publish squad:updated after add/remove/role-change member mutations - Support rerun for squad-assigned issues (targets leader agent) - Reject assignment to squads whose leader is archived --------- Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-13 18:46:20 +08:00
Naiyuan Qing	623d29f276	feat(agents): one-click create from curated templates (Phase 1) (#2520) * docs(agents): three-phase agent quick-create plan Captures the full design for moving agent creation from manual form + one-by-one skill attachment to a tiered experience: - Phase 1 (this PR): one-click curated templates, AI-free. - Phase 2 (next): AI-recommended skills via the existing quick-create task mechanism — no new server-side LLM dependency. - Phase 3 (later): AI creates the whole agent end-to-end, composing Phase 2 with a new `multica agent create` CLI driver. Documents the architectural decisions that keep all three phases on existing infrastructure (no SSE, no server-side LLM SDK, no new WS channels), the two soft blockers Phase 1 unlocks for later phases (createSkillWithFiles TX composability + skill same-name dedupe), and the scope decisions we explicitly opted out of (Anthropic plugin marketplace, ClawHub UI affordances). Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(skills): harden import against invalid UTF-8 and binary files PG rejects two byte patterns in a TEXT column. Both crashed real skill imports we hit while assembling the template catalog: - Embedded NUL (0x00) -> SQLSTATE 22021. Already stripped by sanitizeNullBytes, kept as-is. - Other invalid UTF-8 (e.g. 0x91 — Windows-1252 smart quote in a skill whose author saved prose from Word). sanitizeNullBytes now also runs strings.ToValidUTF8 over the content so the second class no longer takes the whole import down. For non-text payloads (images, fonts, archives, compiled binaries), sanitization isn't the right fix — agents never read those as text, and the bytes can't survive a TEXT column at all. addFile now skips them by extension before the per-bundle cap counters tick, logging the skip so an unexpected drop leaves a breadcrumb. Function name kept for compatibility with the many call sites; both behaviours are strict supersets of the original. 
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(skills): split createSkillWithFiles for tx composition + add workspace find-or-create query Two soft blockers cleared so create-from-template (next commit) can fold N skill creates and the agent + binding writes into one outer transaction: 1. createSkillWithFiles used to Begin/Commit its own tx. Caller composition was impossible — N invocations meant N separate transactions and no atomicity over the whole materialise step. Pull the body into createSkillWithFilesInTx(ctx, qtx, input); the original function becomes a thin wrapper that manages its own tx for standalone callers. Existing call sites: zero behaviour change. 2. Add GetSkillByWorkspaceAndName sqlc query — workspace skill lookup by name, anchored to UNIQUE(workspace_id, name) from migration 008. Lets the template materialiser implement find-or-create: reuse the workspace's existing skill row when a template references the same name, rather than crashing on the unique constraint or polluting the workspace with `<name>-2` clones. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(agents): agent template catalog + create-from-template endpoint Server-side foundation for Phase 1 of the quick-create roadmap (see docs/agent-quick-create-plan.md). Adds: - server/internal/agenttmpl/ — embed-loaded catalog of curated agent templates. Each template ships pre-written instructions plus a list of skill URLs that get materialised into the workspace at create time. Validation runs at startup (init() panics on a malformed template) so a bad JSON ships as a deploy-time defect, not a runtime 500. Slug must equal the filename basename so the URL router is mirror-symmetric with the file layout. - 11 starter templates covering Engineering / Writing / Building / Testing (code-reviewer, frontend-builder, planner, docs-writer, one-pager, html-slides, full-stack-engineer, …). 
- Three new endpoints, all behind RequireWorkspaceMember: GET /api/agent-templates — picker list (no instructions) GET /api/agent-templates/:slug — detail with instructions POST /api/agents/from-template — materialise + create Create flow: 1. Auth + runtime authorization happen BEFORE the GitHub fan-out so a 403 never wastes 20s of upstream fetches. 2. Pre-flight dedupe by cached_name reuses workspace skills without an HTTP fetch — second create-from-the-same-template drops from 20s to <100ms. 3. Parallel fetch (30s per-URL timeout) for the remaining skills. 4. Single transaction: every skill insert, the agent insert, and the agent_skill bindings. On any upstream fetch failure the TX rolls back and the API returns 422 with `failed_urls` so the UI can name the bad source(s). 5. extra_skill_ids (user-supplied additions) are verified through GetSkillInWorkspace per id before attach, so a malicious client can't graft a skill from another workspace via UUID guessing. - multica agent create --from-template <slug> CLI flag dispatches to the new endpoint with a 60s ceiling, matching `multica skill import`. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(agents): one-click create-from-template UI Frontend half of Phase 1. CreateAgentDialog becomes a state machine spanning four steps: chooser → Start blank / From template cards blank-form → existing manual form (post-chooser) duplicate-form → existing form pre-filled from a duplicated agent template-picker → grid of templates, click navigates to detail template-detail → instructions + skill list preview + one-click Use Picking a template never lands on the form: name auto-deduped against existingAgentNames, runtime = first usable one, visibility = private. Refinement happens on the agent detail page if needed. Same rationale the doc spells out — templates exist precisely to skip configuration. 
New components, all collapsible-by-default so quick-create stays fast: - template-picker.tsx — categorised grid, lucide icons + semantic accent tokens resolved through static maps so Tailwind's JIT picks up every variant (dynamic class strings would silently miss). - template-detail.tsx — instructions preview, skill list with cached descriptions, Use CTA. Renders the failedURLs banner when a 422 fires — the only step that can trigger that response. - instructions-editor.tsx — collapsed preview-card / expanded full ContentEditor. - skill-multi-select.tsx + skill-picker-list.tsx — shared multi- select surface, also adopted by the existing skill-add-dialog. - avatar-picker.tsx — agent avatar upload, mirrors the inspector's visual language. Schema-defended client (CLAUDE.md → API Response Compatibility): the three new endpoints are wired through parseWithFallback with lenient zod schemas. Desktop builds outlive any given server — a future field rename / wrapping must not white-screen older installs. listAgentTemplates accepts both the current bare array and a future {templates: [...]} envelope. Coverage: 7 new schema-test cases in schema.test.ts (null body, missing skills/instructions, malformed create response, envelope migration). Catalog + detail go through TanStack Query with staleTime: Infinity — workspace-independent static data, no per-mount refetch. Other: - skill-add-dialog becomes a true multi-select (Confirm button + checkbox list); attached skills are filtered out of the list. - agents-page hands the freshly-created Agent back to the dialog so a follow-up setAgentSkills can attach the form-selected skills. - agent-overview-pane drops the mx-auto/max-w-2xl frame on config- tab content; the wider dialog visual language reads better with tabs filling the column. - Every new UI string lives in both en/agents.json and zh-Hans/agents.json under create_dialog.* / tab_body.skills.* — locales/parity.test.ts blocks drift in CI. 
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ci): align skill import test + drop next-only lint suppression - TestFetchFromSkillsSh_ResolvesRootLevelSkillMd now expects assets/logo.png to be skipped; matches the new addFile binary-extension guard (`6fafd86e`). The .png is intentionally dropped so PG TEXT inserts don't hit SQLSTATE 22021. - packages/views shares zero next/* deps, so the @next/next/no-img-element eslint plugin isn't loaded there. The eslint-disable directive referencing it produced a hard "rule not found" error in CI lint. Raw <img> is the right primitive in views; remove the disable comment. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * test(agents): wrap CreateAgentDialog tests in workspace/navigation providers The dialog now calls useNavigation() and useWorkspacePaths(), both of which throw outside their providers. The existing tests rendered the dialog bare and tripped both new requirements: - NavigationProvider — supply a stub adapter so push() works for the agent-detail redirect. - WorkspaceSlugProvider — useWorkspacePaths() requires a slug. The blank-vs-template chooser is now the default first step; the existing tests target the runtime picker on the manual form, so the helper auto-clicks "Start blank" when no template is passed (duplicate-mode tests skip the chooser). Manual afterEach(cleanup) + document.body wipe. Base UI's Dialog portal renders into document.body and leaves focus-guard/inert wrapper divs behind across tests, so the second test in the suite saw two "All" / "My Runtime" matches and getByText failed. The wipe is local to this file rather than the shared setup because it isn't a global issue — only suites that open Base UI dialogs hit it. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-13 18:26:04 +08:00
Naiyuan Qing	454c8e3d1a	feat: in-app preview for non-image attachments (#2528) * feat(storage): add GetReader to Storage interface Adds a streaming read method to the Storage abstraction so callers can pull object bytes without forcing a full in-memory load. S3Storage wraps GetObject; LocalStorage opens the file with path-traversal and sidecar guards. Tests cover happy path, traversal rejection, sidecar rejection, and missing key. Used in the next commit by the attachment-preview proxy endpoint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(server): add attachment preview proxy endpoint GET /api/attachments/{id}/content streams the raw bytes of a text-previewable attachment back to the client. Exists to (a) bypass CloudFront CORS, which is not configured on the CDN, and (b) bypass Content-Disposition: attachment which Chromium honors for iframe document loads. Media types (image/video/audio/pdf) intentionally do NOT go through this endpoint — clients render them directly from the signed CloudFront download_url, which is already served with Content-Disposition: inline. Hard cap: 2 MB. Larger files return 413. Anything outside the text whitelist returns 415. The whitelist (isTextPreviewable) mirrors the client-side dispatcher; the cross-reference comment in file.go flags the manual sync until a JSON SSOT generator lands. Response always uses Content-Type: text/plain; charset=utf-8 so a hostile HTML payload can't be re-interpreted as a document. The original MIME ships via X-Original-Content-Type for client dispatch. Cache-Control: no-store so revoked attachment access takes effect immediately on the next request. Tests cover happy path (md), extension fallback when content_type is generic, 415 (pdf), 413 (>2MB), foreign workspace (404 isolation), and the isTextPreviewable table. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(core/api): add getAttachmentTextContent + preview error types Adds an ApiClient method that fetches the text body of an attachment via the new /api/attachments/{id}/content proxy. Two typed errors — PreviewTooLargeError (413) and PreviewUnsupportedError (415) — let the preview modal render specific fallbacks instead of a generic failure. Refactors the private fetch() into a shared fetchRaw() helper so the new method inherits the standard infra: auth headers, 401 → handleUnauthorized recovery, X-Request-ID, error logging, and the ApiError contract. The previous draft bypassed all of these by calling window.fetch directly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(views/editor): add AttachmentPreviewModal + Eye entry points In-app preview for non-image attachments. An Eye icon now sits next to the existing Download button on file cards / readonly file cards / the standalone AttachmentList. Clicking it opens a full-screen modal that dispatches by content_type: pdf: <iframe src={download_url}> — Chromium PDFium video/: <video controls src={download_url}> — native controls audio/: <audio controls src={download_url}> — native controls md: <ReadonlyContent> — full markdown pipeline html: <iframe srcdoc sandbox=""> — fully restricted text: <code class="hljs"> — lowlight highlight Media types render directly from the signed CloudFront download_url (server marks them inline-disposition). Text types fetch through the new /api/attachments/{id}/content proxy via TanStack Query, wrapped in useAttachmentPreview() so each entry point owns its own modal state without depending on a global Provider mount. Modal sizing: max-w-6xl × min(90vh, 100vh - 2rem) — slightly larger than create-issue's max-w-4xl since PDF / video need room, but capped to viewport on small screens. Sub-renderers use h-full to follow the fixed modal height instead of viewport-relative units. 
Images are intentionally NOT touched — the existing ImageLightbox (extensions/image-view.tsx) already handles them correctly. The new modal would be churn without user-visible benefit. Adds i18n keys under attachment.* (en + zh-Hans) and registers Preview/Download/Upload in the conventions glossary so future translations stay consistent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(desktop): enable Chromium PDF viewer for attachment preview Adds webPreferences.plugins: true to the main BrowserWindow so the bundled Chromium PDFium plugin activates inside iframes — required for the attachment preview modal's PDF dispatch. Default is false in Electron; without it <iframe src=*.pdf> renders blank. Security trade-off, accepted intentionally and documented inline: 1. This window already runs with webSecurity: false + sandbox: false, so plugins: true does NOT meaningfully widen the renderer's attack surface beyond what is already accepted. 2. The only PDFs that reach an iframe here are signed CloudFront URLs we ourselves issued; user-supplied URLs are routed through setWindowOpenHandler → openExternalSafely and cannot land in this renderer. 3. Chromium's PDFium plugin is itself sandboxed and only handles application/pdf — no Flash/Java/other historical plugin surfaces. If we ever tighten webSecurity / sandbox, the follow-up is to host the PDF viewer in a dedicated BrowserView with plugins scoped to that view, keeping the main renderer plugin-free. Old desktop builds ship without the preview modal, so the Eye button never appears and PDF preview is gated by the same release — zero regression risk for users on stale clients. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 18:24:15 +08:00
Bohan Jiang	5db96b4007	fix(daemon): bypass Gemini folder-trust gate in headless mode (#2516) (#2523) Gemini CLI's folder-trust feature throws FatalUntrustedWorkspaceError (exit code 55) when the current workspace isn't in `~/.gemini/trustedFolders.json` and the process is headless, so no interactive trust prompt is available. The daemon spawns gemini with `-p` + `--yolo` in a freshly checked-out worktree that the user has never trusted interactively, so every run with `security.folderTrust` enabled fails after ~10s with exit status 55 and no useful output. Default `GEMINI_CLI_TRUST_WORKSPACE=true` on the child env to short-circuit `checkPathTrust` in gemini-core. This mirrors gemini-cli's documented `--skip-trust` flag; the env var has been gemini's documented headless escape hatch for the entire folder-trust feature lifetime so the fix works on every gemini version that can produce the crash. Callers that explicitly set the same key in cfg.Env win, preserving the ability to opt back into the gate. Co-authored-by: multica-agent <github@multica.ai>	2026-05-13 17:05:12 +08:00
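The "default on the child env, explicit caller value wins" rule from the folder-trust commit above can be sketched as a small merge. This is a minimal illustration, not the daemon's code: `childEnv` and its map-based signature are hypothetical, and only the variable name `GEMINI_CLI_TRUST_WORKSPACE` comes from the commit message.

```go
package main

// trustKey is the env var named in the commit message.
const trustKey = "GEMINI_CLI_TRUST_WORKSPACE"

// childEnv (hypothetical helper) builds the env for the spawned CLI.
// The trust default is seeded first, then caller overrides are applied,
// so an explicit value for the same key replaces the default.
func childEnv(overrides map[string]string) map[string]string {
	env := map[string]string{trustKey: "true"}
	for k, v := range overrides {
		env[k] = v
	}
	return env
}
```

Seeding the default before applying overrides keeps the opt-back-in path trivial: a caller that sets the key to any other value simply overwrites it.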
Bohan Jiang	178cfb5008	fix(daemon): strip Windows chcp noise from runtime version (#2516) (#2521) The gemini CLI's Windows shim emits `Active code page: 65001` (from `chcp`) to stdout before the real version reaches `--version` output. The daemon stored the raw concatenation as the runtime version, so the runtime detail page rendered `Active code page: 65001 0.42.0` instead of `0.42.0`. Scan `<cli> --version` line by line and return the first line carrying a semver-shaped token. Full strings like `2.1.5 (Claude Code)` or `codex-cli 0.118.0` survive unchanged; unparseable output falls back to the trimmed raw value. Co-authored-by: multica-agent <github@multica.ai>	2026-05-13 16:58:14 +08:00
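The line-by-line scan described in the chcp commit above might look roughly like this. It is a sketch under assumptions: `cleanVersion` is a hypothetical name, and "semver-shaped token" is taken here to mean three dot-separated integers, which may be stricter or looser than the daemon's actual pattern.

```go
package main

import (
	"regexp"
	"strings"
)

// semverToken is an assumed definition of "semver-shaped": MAJOR.MINOR.PATCH.
var semverToken = regexp.MustCompile(`\d+\.\d+\.\d+`)

// cleanVersion (hypothetical helper) returns the first line of raw that
// carries a semver-shaped token. The whole line is kept so that strings
// like "codex-cli 0.118.0" survive unchanged. If no line matches, the
// trimmed raw output is returned as a fallback.
func cleanVersion(raw string) string {
	for _, line := range strings.Split(raw, "\n") {
		line = strings.TrimSpace(line) // also drops a trailing "\r" on Windows
		if semverToken.MatchString(line) {
			return line
		}
	}
	return strings.TrimSpace(raw)
}
```

With this pattern, `Active code page: 65001` is skipped because `65001` alone is not a three-part version, and the following `0.42.0` line is returned.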
Bohan Jiang	51aa924124	feat(chat): support renaming chat sessions inline (#2522) Adds a pencil icon next to the trash icon on each session row in the chat dropdown. Clicking it turns the title into an inline editable input: Enter / blur saves, Escape cancels. Server: new PATCH /api/chat/sessions/{id} handler that updates the title via the existing `UpdateChatSessionTitle` sqlc query, broadcasts a new `chat:session_updated` WS event so other tabs / devices stay in sync, and rejects blank titles. Frontend mutation is optimistic with rollback, matching the existing delete-session pattern. MUL-2110 Co-authored-by: multica-agent <github@multica.ai>	2026-05-13 16:57:34 +08:00
Bohan Jiang	384ddcbe65	fix(execenv): seed user-installed Codex skills into per-task CODEX_HOME MUL-1626 (#2519 ) * fix(execenv): seed user-installed Codex skills into per-task CODEX_HOME Codex is the only daemon runtime whose HOME is redirected — the daemon sets CODEX_HOME to a per-task isolated directory so each task gets a clean config slate without polluting ~/.codex/. Side effect: the codex CLI never sees the user's `~/.codex/skills/` and tells the user no skill was found. Other runtimes (claude / copilot / opencode / pi / cursor / kimi / kiro) don't have this issue: they leave HOME untouched and discover both user-level skills (from ~/.<runtime>/skills) and workspace-assigned skills (written to a workdir-local dotfile dir) natively. Codex is the outlier. Fix: in execenv.Prepare and execenv.Reuse, copy each subdirectory under `~/.codex/skills/` into the per-task `codex-home/skills/` before writing workspace-assigned skills. Workspace skills still win on sanitized-name conflict; user-level installer symlinks (lark-cli style) are followed so the per-task home gets real content rather than dangling links. Closes #1922 Co-authored-by: multica-agent <github@multica.ai> * fix(execenv): wipe per-task codex skills dir before each hydration Without this, the Reuse path leaves two classes of stale state behind: 1. Round 1 seeded user skill `writing/drafts/stale.md`. Round 2 reuses the same workdir with workspace skill `Writing` assigned: seed stage skips user `writing` (reserved), workspace stage writes `SKILL.md` via MkdirAll + WriteFile but never clears the directory, so the round-1 user support files surface under the workspace skill — violating "workspace fully wins on name conflict" and potentially leaking user-level files into a workspace skill view. 2. User uninstalls a skill from ~/.codex/skills between two runs. The prior copy in codex-home/skills/<name>/ lingers, so the codex CLI keeps seeing the removed skill. 
Fix: RemoveAll(codex-home/skills) at the start of hydrateCodexSkills, then re-seed user skills and re-write workspace skills. On Prepare this is a no-op (envRoot was already wiped); on Reuse it resets the slate. Added two regression tests covering both scenarios. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-13 16:35:03 +08:00
Naiyuan Qing	e8c2855746	fix(chat): collapse chat-done flicker via inline cache write (#2509 ) * fix(chat): collapse chat-done flicker via inline cache write The chat panel flickered at end-of-turn: live TimelineView unmounted → short blank + scroll jump → persistent AssistantMessage finally appeared. Root cause: chat:done's WS handler called setQueryData(pendingTask, {}) synchronously while invalidateQueries(messages) was an async refetch. The render guard pendingAlreadyPersisted (chat-message-list.tsx:62-68) expected the persisted message to already be in the messages cache before pending cleared, but the sync/async ordering broke that guard. Fix follows TkDodo's "combine setQueryData (active query) + invalidate (others)" pattern. ChatDonePayload now carries the freshly-persisted ChatMessage (id, content, elapsed_ms, created_at); the WS handler writes it into chatKeys.messages BEFORE clearing pending. Same render tick → AssistantMessage mounts before TimelineView unmounts → no flicker. invalidate(messages) stays as a fallback for clients that took the older code path or for content drift (redaction, etc.). Also slim task:completed's chat branch — chat:done already wrote the message and cleared pending; task:completed only refreshes the cross-session pending aggregate that drives the FAB. Field additions are all `omitempty` / TS `?:` so older clients ignore them and older servers (no fields populated) fall back to invalidate- only, preserving prior behavior. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * test(chat): cover chat done cache handoff Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: Eve <eve@multica-ai.local>	2026-05-13 15:27:44 +08:00
Bohan Jiang	451c46c43f	refactor(usage): rename Dashboard → Usage + dynamic per-agent leaderboard (#2511 ) The page added in #2462 lived at `/{slug}/dashboard` and was titled "Dashboard", which collides with the conventional meaning ("personal landing surface") and doesn't tell new users what the page is for. Its actual contents — token spend, cost, run time, task counts — map cleanly onto the OpenAI / Anthropic / Vercel "Usage" surface, so rename to that. Renames (user-visible) - Route: `/{slug}/dashboard` → `/{slug}/usage` (web App Router + desktop memory router) - Sidebar entry: label "Dashboard" / "看板" → "Usage" / "用量", icon LayoutDashboard → BarChart3 (page header icon swapped in sync) - Page title in en/zh-Hans - Reserved-slugs: add `usage` to workspace route segments group; `dashboard` stays reserved in the marketing group (back-compat against workspace slug collisions + keeps the name free for a future Home page) - i18n namespace `dashboard` → `usage` across resources-types.ts, locales/index.ts, and the moved JSON files - WORKSPACE_ROUTE_SEGMENTS in editor link-handler - paths.workspace(slug).dashboard() → .usage(), with matching test expectation updates Per-agent leaderboard polish (`packages/views/dashboard/components/ dashboard-page.tsx`) - Card title "Cost & run time by agent" → "Leaderboard" with a 4-way Segmented control: Tokens / Cost / Time / Tasks - Active metric drives row order, progress-bar width, and the emphasised column header / cell — keeping ranking, visual quantity, and column emphasis in lockstep so users always see what's being measured - Default sort = Tokens (most universally meaningful; Cost still one click away) - Project filter dropdown: - Show ProjectIcon next to the selected project + each list item; FolderKanban as the "All projects" fallback (matches ProjectPicker language) - alignItemWithTrigger={false} so "All projects" doesn't get pushed above the trigger and clipped when the header sits at the top of the viewport (was the 
root cause of "can't re-select All projects" once a project was selected) - max-h-72 to cap the dropdown when workspaces accrue many projects; matches the runtime-detail Select precedent - Folder name `packages/views/dashboard/*` and `DashboardPage` component name intentionally left in place — user-visible rename only, no broad code refactor. Old `/dashboard` routes are not redirected because the page only landed in #2462 (a few days ago); no real users, external links, or desktop-tab persistence have settled on it yet.	2026-05-13 14:07:53 +08:00
Multica Eve	ff27142b69	fix: treat empty output on successful completion as completed, not blocked (#2507) When an agent completes successfully (exit 0) but produces no text output, the daemon incorrectly classified it as 'blocked'. This is wrong: agents can legitimately complete work via tool calls (posting comments, pushing code) without emitting text output. Change the empty-output path to return status=completed so the task is correctly reported as successful. Fixes MUL-2104 Co-authored-by: yushen <ldnvnbl@gmail.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-13 12:56:17 +08:00
Bohan Jiang	96695a79c5	feat(dashboard): workspace/project token + run-time dashboard MUL-1882 (#2462 ) * feat(dashboard): workspace/project token + run-time dashboard Add a `/{slug}/dashboard` page showing per-agent token spend and execution time across the whole workspace, with an optional project filter. Backend: - Three new sqlc queries against task_usage + agent_task_queue: daily usage, per-agent usage, per-agent total run-time. All optionally scoped to a project via sqlc.narg('project_id'), reaching project through the issue join. - Handlers under /api/dashboard return the same wire shape the runtime page already consumes (model preserved for client-side cost math). Frontend: - Shared DashboardPage in packages/views/dashboard reusing KpiCard, DailyCostChart, ActorAvatar, and estimateCost from the runtime page so the visual style and pricing math stay in lock-step. - Period selector (7/30/90d), project dropdown, four KPI tiles (cost, tokens, run time, tasks), daily cost chart, and a combined "cost + run time by agent" list. - Routed in both web (app/[slug]/(dashboard)/dashboard) and desktop (memory router); sidebar nav entry added under Workspace group. Co-authored-by: multica-agent <github@multica.ai> * fix(dashboard): drop stale project filter and stop double-counting tasks Two issues caught in PR #2462 review: 1. Project filter held the previous selection's UUID across workspace switches and project deletions: the dropdown gracefully showed "All projects" (because the title lookup missed) while the three dashboard queries kept forwarding the dead UUID, leaving the UI looking like a full-workspace view but populated with empty project-scoped data. Validate the picked UUID against the current projects list before passing it to the queries. 2. The "by agent" table read its task count from the token rollup, which is grouped per (agent, model). A single task that spans two models lands twice and the agent's row reads e.g. "2 tasks" when the real count is 1. 
Prefer `ListDashboardAgentRunTime`'s per-agent distinct count when available; fall back to the token aggregate only for agents with no terminal run yet (in-flight tasks). Extract the merge into `mergeAgentDashboardRows` so the precedence rules are unit-tested directly. Co-authored-by: multica-agent <github@multica.ai> * test(dashboard): allocate per-workspace issue.number explicitly TestDashboardEndpoints creates two issues in the shared fixture workspace. issue.number defaults to 0 (migration 020), and the table carries UNIQUE (workspace_id, number), so the second insert raced the first on the same default and failed in CI. Allocate MAX(number) + 1 per insert so each row gets a fresh number without stepping on rows other tests left behind in the same workspace. Co-authored-by: multica-agent <github@multica.ai> * feat(dashboard): rollup table + cron-driven aggregation for dashboard Mirror the per-runtime rollup in `task_usage_daily` (migrations 073/077/082) to remove the per-request raw aggregation the dashboard was doing. Migration 084 adds: - `task_usage_dashboard_daily` keyed on (bucket_date, workspace_id, agent_id, project_id, model) — the dimensions the dashboard actually queries, with project_id nullable via UNIQUE NULLS NOT DISTINCT (PG15+) so "no-project" buckets upsert cleanly. - `task_usage_dashboard_rollup_state` watermark table. - `task_usage_dashboard_dirty` invalidation queue. - Triggers on agent_task_queue DELETE, task_usage DELETE, and issue.project_id UPDATE — the cases the updated_at watermark can't see. The project_id trigger re-attributes existing rollup rows when a user moves an issue across projects. - `rollup_task_usage_dashboard_daily_window(from, to)` — idempotent recompute primitive (same shape as 077). - `rollup_task_usage_dashboard_daily()` cron entry — own advisory lock (4244) so it serialises independently of the runtime rollup. - `task_usage_dashboard_rollup_lag_seconds()` health helper. 
Sqlc queries `ListDashboardUsageDailyRollup` / `ListDashboardUsageByAgentRollup` read from the new table; the handler dispatches between rollup and raw on a separate `UseDailyRollupForDashboard` config flag (`USAGE_DASHBOARD_ROLLUP_ENABLED` env). Same fail-safe default (false → raw) so operators can roll out independently of the per-runtime flag. Bucket date is UTC (the dashboard aggregates across runtimes that may sit in different tzs; there's no single correct local boundary). Adds `cmd/backfill_task_usage_dashboard_daily` mirroring the existing per-runtime backfill — operator runs it once before flipping the flag. Tests: - TestDashboardEndpoints now also exercises the rollup read path (raw vs. rollup, same project-scoped totals). - TestDashboardRollupReattributesOnProjectChange verifies the issue.project_id trigger enqueues both old + new buckets and the next rollup tick zeroes the old project + populates the new one. Co-authored-by: multica-agent <github@multica.ai> * fix(dashboard-rollup): close two invalidation gaps Two leak paths missed by migration 084 review: 1. Issue cascade DELETE — the atq BEFORE DELETE trigger runs AFTER the issue row is gone, so `LEFT JOIN issue` returns NULL project_id and the original-project bucket never gets cleared (issue 077 calls this out for the runtime rollup but didn't need to act on it). Adds an `issue BEFORE DELETE` trigger that enqueues using OLD.project_id while the issue row is still readable. 2. `LinkTaskToIssue` (quick-create task attaching to a real issue post- completion) UPDATEs `agent_task_queue.issue_id` from NULL to a real id. Migration 084 only watched DELETE on atq, so usage already rolled up under the no-project bucket stayed attributed to NULL forever. Extends the atq trigger to fire on UPDATE OF issue_id too, enqueueing both OLD (NULL project) and NEW (linked issue's project). Tests: - TestDashboardRollupClearsOnIssueDelete asserts rollup row drops to zero after issue delete + rollup tick. 
- TestDashboardRollupReattributesOnLinkTaskToIssue verifies tokens move from the NULL bucket to the project bucket after the UPDATE. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-13 12:51:16 +08:00
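The task-count precedence from the dashboard commit above (prefer the per-agent distinct count from the run-time query, fall back to the token aggregate only when no run-time row exists) can be sketched as follows. The real `mergeAgentDashboardRows` likely lives in the frontend and has a richer row shape; the types and the `taskCounts` function here are hypothetical and written in Go purely for illustration.

```go
package main

// tokenRow mirrors a token rollup row grouped per (agent, model).
// Field names are assumptions for this sketch.
type tokenRow struct {
	AgentID string
	Model   string
	Tasks   int
}

// runTimeRow carries the per-agent distinct task count from the
// run-time query. Field names are assumptions for this sketch.
type runTimeRow struct {
	AgentID string
	Tasks   int
}

// taskCounts (hypothetical) returns one task count per agent.
// Summing token rows can double-count a task that spans two models,
// so a run-time row, when present, replaces the summed value.
func taskCounts(tokens []tokenRow, runs []runTimeRow) map[string]int {
	counts := make(map[string]int)
	for _, t := range tokens {
		counts[t.AgentID] += t.Tasks // fallback value, possibly inflated
	}
	for _, r := range runs {
		counts[r.AgentID] = r.Tasks // authoritative distinct count
	}
	return counts
}
```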
Bohan Jiang	a02e58b488	fix(github): only auto-close issue after all linked PRs resolve (#2470 ) * fix(github): only auto-close issue when all linked PRs have resolved Previously, the webhook handler unconditionally moved an issue to `done` as soon as a single linked PR was merged. If a second PR was also linked to the same issue and still open / draft, the issue would close before the work was actually finished. Add `CountOpenSiblingPullRequestsForIssue` and gate the auto-status transition on it: a merged PR advances its linked issues only when no sibling PR linked to the same issue is still in flight. Issues stay put while siblings are open or draft, and the merge that resolves the last in-flight PR is the one that closes the issue. Adds an integration test that opens two PRs against the same issue, merges the first, asserts the issue stays in_progress, then merges the second and asserts the issue advances to done. Co-authored-by: multica-agent <github@multica.ai> * fix(github): re-evaluate auto-close on closed-without-merge events too GPT-Boy review on #2470: gating only the `state == "merged"` branch left one ordering hole. PR-A merges first → issue stays in_progress because PR-B is open; PR-B later closes WITHOUT merging → no event ever re-runs the auto-close check, so the issue is stuck in_progress. Generalise the trigger to every terminal PR event (`merged` or `closed`) and advance the issue only when: - the issue is not already terminal (done / cancelled); - no sibling PR is still in flight (open / draft); - at least one linked PR — current or sibling — actually merged. Rule (3) preserves "user closed every PR without merging → leave the issue alone": if no work was delivered, the user decides what to do. Replace `CountOpenSiblingPullRequestsForIssue` with `GetSiblingPullRequestStateCountsForIssue`, which returns both the in-flight count and the merged count in a single roundtrip. 
Adds `TestWebhook_ClosedSiblingAfterMerge` (the regression GPT-Boy flagged) and `TestWebhook_AllClosedWithoutMerge` (the negative case guarding rule 3). Refactors the multi-PR webhook helper out of the existing two-merge test so all three multi-PR scenarios share it. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-12 15:39:55 +08:00
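The three auto-close rules from the commit above reduce to one predicate. The sketch below is illustrative only: `siblingCounts` and `shouldAdvanceToDone` are hypothetical names, and the counts are assumed to cover sibling PRs only (not the PR whose event is being handled), matching the "current or sibling" wording of rule 3.

```go
package main

// siblingCounts is an assumed shape for the single-roundtrip result
// described in the commit: in-flight (open / draft) and merged siblings.
type siblingCounts struct {
	InFlight int
	Merged   int
}

// shouldAdvanceToDone (hypothetical) decides whether a terminal PR event
// should move the linked issue to done.
func shouldAdvanceToDone(issueStatus, prState string, siblings siblingCounts) bool {
	// Only terminal PR events re-run the check.
	if prState != "merged" && prState != "closed" {
		return false
	}
	// Rule 1: the issue is not already terminal.
	if issueStatus == "done" || issueStatus == "cancelled" {
		return false
	}
	// Rule 2: no sibling PR is still open or draft.
	if siblings.InFlight > 0 {
		return false
	}
	// Rule 3: at least one linked PR, current or sibling, actually merged.
	return prState == "merged" || siblings.Merged > 0
}
```

Checking rule 3 last keeps the "every PR closed without merging" case a plain fall-through: nothing merged, so the issue is left for the user to decide.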
Bohan Jiang	caeb146bac	feat(github): GitHub App integration for PR ↔ issue linking (#1817 ) * feat(github): GitHub App backend for PR ↔ issue linking - New tables: github_installation (workspace ↔ App install), github_pull_request (mirrored PR state), issue_pull_request (M:N link). - Webhook handler verifies HMAC-SHA256, upserts PR rows, parses issue identifiers from PR title/body/branch and auto-links them. Merging a linked PR moves the issue to done. - Connect/setup endpoints power the zero-config "Connect GitHub" install flow; state token is HMAC-signed so the setup callback can recover the workspace. - Workspace-scoped admin routes for listing/disconnecting installations, plus a per-issue `pull-requests` list endpoint. Co-authored-by: multica-agent <github@multica.ai> * feat(github): UI for connecting GitHub and viewing linked PRs - Settings → Integrations: new tab with Connect GitHub / installations list / disconnect, gated on the deployment having the App configured. - Issue detail sidebar: Pull requests section showing linked PR title, repo, state (open/draft/merged/closed), and author, with deep link to GitHub. - Real-time refresh: github_installation:* and pull_request:* events invalidate the matching TanStack Query caches. Co-authored-by: multica-agent <github@multica.ai> * fix(github): address review — null actor, role gating, configured guard, scoped uninstall broadcast - listeners: use optionalUUID(e.ActorID) so the system actor on the github-driven issue:updated event no longer panics activity / notification listeners; merged-PR → issue done now produces a status_changed activity and inbox entry. - IntegrationsTab: gate the admin-only installations query on canManage so members no longer hit /github/installations 403; the configured/not-configured copy is also scoped to admins. 
- backend: introduce isGitHubConfigured() requiring both GITHUB_APP_SLUG and GITHUB_WEBHOOK_SECRET, and surface that single flag from list-installations + connect endpoints so the frontend Connect button stays disabled until both are set. - DeleteGitHubInstallationByInstallationID now RETURNs workspace_id; webhook handler publishes github_installation:deleted scoped to the right workspace so already-open Settings tabs invalidate in real time. ErrNoRows on a re-fired delete short-circuits cleanly. - tests: focused webhook integration coverage (auto-link + merge → done, cancelled preservation, uninstall returns workspace). Co-authored-by: multica-agent <github@multica.ai> * fix(github): i18n the new GitHub UI strings to satisfy lint CI flagged every literal string in the Integrations tab, the Pull requests sidebar section, and the per-PR row label. Move them through useT() and add the matching `integrations.` block to settings.json (en / zh-Hans) plus `detail.section_pull_requests` / `detail.pull_request_state_` / loading + empty copy under `issues.json`. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-12 13:49:03 +08:00
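The HMAC-SHA256 webhook check mentioned in the GitHub App commit above follows GitHub's documented scheme: the `X-Hub-Signature-256` header carries `sha256=` plus the hex HMAC of the raw request body, keyed with the webhook secret. A minimal sketch, with hypothetical function names (`sign`, `validSignature`) that are not taken from the repository:

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"strings"
)

const sigPrefix = "sha256="

// sign (hypothetical) produces the header value GitHub would send for body.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return sigPrefix + hex.EncodeToString(mac.Sum(nil))
}

// validSignature (hypothetical) checks an X-Hub-Signature-256 header value
// against the raw body. hmac.Equal compares in constant time.
func validSignature(secret, body []byte, header string) bool {
	if !strings.HasPrefix(header, sigPrefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, sigPrefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}
```

The check has to run on the exact bytes received, before any JSON decoding, since re-serialising the payload would change the digest.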
Bohan Jiang	f08b2b4f50	fix(attachments): harden local sidecar serving and tighten Upload gate (#2459 ) Follow-ups to #2444: - ServeFile refuses keys ending in .meta.json so the sidecar JSON isn't a stable read API. Sits before any disk work so a crafted .meta.json sibling can't trigger an out-of-tree read. - ServeFile rejects paths that resolve outside uploadDir (via filepath.Rel) before readLocalMeta runs. http.ServeFile's own .. guard fires later on r.URL.Path, but readLocalMeta would otherwise do a stray disk read on <some-path>.meta.json before the 400 lands. - Upload only writes a sidecar when filename is non-empty. ServeFile only reads the filename anyway, so a content-type-only sidecar was dead disk weight. - Drop the dead json.Marshal error branch — marshaling two strings cannot fail. Three new tests cover sidecar suffix rejection, the traversal guard, and the no-filename Upload short-circuit. Co-authored-by: multica-agent <github@multica.ai>	2026-05-12 12:49:22 +08:00
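The two ServeFile guards from the hardening commit above (refuse `.meta.json` keys, refuse anything that resolves outside the upload directory via `filepath.Rel`) can be sketched as a pure check that runs before any disk access. `allowedKey` is a hypothetical name, and rejecting the upload directory itself is an extra assumption of this sketch rather than something the commit states.

```go
package main

import (
	"path/filepath"
	"strings"
)

// allowedKey (hypothetical) reports whether key may be served from uploadDir.
func allowedKey(uploadDir, key string) bool {
	// Sidecar metadata is not a readable resource.
	if strings.HasSuffix(key, ".meta.json") {
		return false
	}
	// Join cleans the path, so "a/../../x" collapses before Rel sees it.
	rel, err := filepath.Rel(uploadDir, filepath.Join(uploadDir, key))
	if err != nil {
		return false
	}
	// Assumption: the directory itself is never a valid key.
	if rel == "." {
		return false
	}
	// A relative path that starts by climbing out is outside the tree.
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return false
	}
	return true
}
```

Running the suffix check first means a crafted `.meta.json` key is rejected without ever reaching path resolution, which mirrors the ordering the commit describes.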
Truffle	91bdec9a54	fix(attachments): preserve original filename on /uploads/* downloads (#2444 ) LocalStorage.ServeFile delegated straight to http.ServeFile without setting Content-Disposition, so downloads of local-storage attachments landed on disk under the UUID-based storage key instead of the human filename the uploader had chosen. The S3 backend already sets Content-Disposition on PutObject (s3.go:186-187), so the local backend was the only one losing the original filename — a sibling asymmetry that's been there since multi-backend support landed. Upload now writes a sidecar <key>.meta.json beside the data file capturing the original filename and sniffed content type. ServeFile reads the sidecar when present and sets Content-Disposition using the existing sanitizeFilename + isInlineContentType helpers, mirroring the S3 inline/attachment decision exactly. Uploads from before this lands have no sidecar and fall through to the previous behavior. Delete now removes the sidecar alongside the data file so the upload directory doesn't grow orphans. Closes #2442	2026-05-12 12:37:07 +08:00
Naiyuan Qing	86aa5199fc	feat(chat): support attachments & images in chat input (#2445 ) * docs(plans): chat attachment & image support implementation plan Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(db): add chat_session_id/chat_message_id to attachment Co-authored-by: multica-agent <github@multica.ai> * feat(db): sqlc — chat_session_id on CreateAttachment + LinkAttachmentsToChatMessage Co-authored-by: multica-agent <github@multica.ai> * feat(file): upload-file accepts chat_session_id form field Co-authored-by: multica-agent <github@multica.ai> * feat(chat): SendChatMessage links uploaded attachments to the new message Co-authored-by: multica-agent <github@multica.ai> * feat(api): uploadFile accepts chatSessionId; sendChatMessage accepts attachmentIds Co-authored-by: multica-agent <github@multica.ai> * feat(core): useFileUpload supports chatSessionId context Co-authored-by: multica-agent <github@multica.ai> * feat(chat): support paste/drag/upload attachments in chat input Co-authored-by: multica-agent <github@multica.ai> * test(e2e): chat input attachment upload + send round-trip Co-authored-by: multica-agent <github@multica.ai> * chore(chat): keep lazy-created session title empty so untitled fallback localizes Co-authored-by: multica-agent <github@multica.ai> * fix(chat): address review — dedupe ensureSession + parse upload response - chat-window: cache in-flight createSession promise in a ref so a file drop followed by a quick send no longer spawns two sessions (and orphans the attachment on the losing one). - Attachment type + EMPTY_ATTACHMENT + AttachmentResponseSchema: include the new chat_session_id / chat_message_id fields the server now returns. - uploadFile: route the response through parseWithFallback so a malformed body returns EMPTY_ATTACHMENT instead of an undefined-keyed Attachment, matching the API boundary rule. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * fix(chat): address PR #2445 review — test ctx, send gating, attachment surface 1. Backend test was 400ing because the handler reads workspace from middleware-injected ctx, and `newRequest` only sets the header. Helper `withChatTestWorkspaceCtx` mirrors the agent-access-test pattern and loads the member row + SetMemberContext before invoking the handler. 2. Attachment metadata now flows end-to-end: - new sqlc `ListAttachmentsByChatMessageIDs` (batch lookup, mirrors the comment-side query) - `chatMessageToResponse` takes `attachments` and `ChatMessageResponse` surfaces them — same shape as CommentResponse - `ListChatMessages` loads them via a new `groupChatMessageAttachments` helper so the chat bubble can render file cards - daemon claim path pulls `ListAttachmentsByChatMessage` for the latest user message and ships `ChatMessageAttachments` to the daemon - `buildChatPrompt` lists id+filename+content_type and instructs the agent to `multica attachment download <id>` — fixes the private-CDN expiring-URL problem where the markdown URL would have expired by the time the agent acts - TS `ChatMessage` gains an optional `attachments` field 3. Chat composer now blocks send while uploads are in flight: - `pendingUploads` counter increments in handleUpload, SubmitButton uses it to disable - handleSend also gates on `editorRef.current.hasActiveUploads()` to catch the Mod+Enter path that bypasses the button - new vitest covers the "drop large file → immediate send" scenario where attachment id would otherwise be silently dropped Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * chore: drop implementation plan doc Process artefact, not something the repo needs to keep. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-12 10:57:54 +08:00
Bohan Jiang	63d215e1c3	feat(runtime): visibility (public/private) gate on CreateAgent / UpdateAgent (#2419 ) * feat(runtime): visibility (public/private) gate on CreateAgent / UpdateAgent Closes the hole where a plain workspace member could pick another member's runtime in the Create Agent dialog and bind an agent to it — the backend wasn't checking runtime ownership, so the agent ran on someone else's hardware / tokens. Reported on GH #1804. Schema - Migration 083 adds agent_runtime.visibility ('private' default, 'public') with a CHECK constraint. Existing rows default to private — same ownership semantics as before, no behavior change for legacy data. Backend - canUseRuntimeForAgent predicate: allow when caller is workspace owner/admin, the runtime owner, or the runtime is public. - CreateAgent and UpdateAgent both gate on it: UpdateAgent matters because a plain member could otherwise create on their own runtime, then re-bind to a private one. - PATCH /api/runtimes/:id accepts { visibility } — owner/admin only, validated against the same private/public allow-list. Frontend - Create-agent dialog renders other-owned private runtimes disabled with a Lock badge + tooltip explaining who to ask. - Inspector runtime-picker disables the same set so re-binding fails the same way at the UI layer. - Runtime detail diagnostics gains a Visibility editor (owner/admin) or read-only chip (everyone else). - Runtime list shows a private/public chip next to the name. Tests - Go: canUseRuntimeForAgent truth table; CreateAgent / UpdateAgent end-to-end gate tests (admin / runtime owner / plain member); PATCH visibility owner / admin / member / invalid-value coverage. - Vitest: create-agent dialog disabled state on private/public runtimes, default-runtime selection skips locked rows; runtime detail visibility editor → mutation, read-only fallback. Migrating runtimes: existing rows default to private to preserve the "owner only" status quo. 
Owners switch to public via the detail page diagnostics card. Co-authored-by: multica-agent <github@multica.ai> * fix(runtime): apply timezone+visibility atomically; don't seed locked template runtime Two issues surfaced in review of MUL-2062: 1. PATCH /api/runtimes/:id ran the timezone branch first, which: - returned early on a tz no-op, silently dropping a concurrent `visibility` patch in the same body; - committed the timezone mutation (+ usage rollup rebuild) before validating visibility, so an invalid visibility left the row half-updated. Validate every field first, then run the mutations in order. The no-op short-circuit now only triggers when nothing else is requested. 2. The Create Agent dialog in duplicate mode unconditionally seeded `template.runtime_id` as the selected runtime, even when that runtime is now private and owned by someone else — the user saw a selected row they couldn't submit (Create → backend 403). Fall back to the first usable runtime when the template's runtime is locked, and gate the Create button on `selectedRuntimeLocked` as defense in depth. Tests: - Go: TestUpdateAgentRuntime_CombinedPatchAppliesBoth (tz no-op + visibility flip), TestUpdateAgentRuntime_InvalidVisibilityDoesNotMutateTimezone (atomic-fail invariant). - Vitest: duplicate template pointing at a locked runtime now seeds the first usable one; Create button stays disabled when no usable alternative exists. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-11 22:53:07 +08:00
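The `canUseRuntimeForAgent` predicate described in the visibility commit above allows a caller who is a workspace owner or admin, who owns the runtime, or who targets a public runtime. A minimal sketch of that truth table follows. The struct fields, role strings, and the `canUseRuntime` name are assumptions made for illustration, not the repository's actual types.

```go
package main

// runtimeInfo holds the two runtime attributes the predicate needs.
// Field names are assumptions for this sketch.
type runtimeInfo struct {
	OwnerID    string
	Visibility string // "private" or "public"
}

// canUseRuntime (hypothetical) reports whether the caller may bind an
// agent to rt.
func canUseRuntime(callerID, callerRole string, rt runtimeInfo) bool {
	if callerRole == "owner" || callerRole == "admin" {
		return true // workspace owner/admin may use any runtime
	}
	if rt.OwnerID == callerID {
		return true // runtime owner may always use their own runtime
	}
	return rt.Visibility == "public"
}
```

Per the commit, the same predicate has to gate both create and update; otherwise a member could create an agent on their own runtime and then re-bind it to someone else's private one.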
Bohan Jiang	046e4b1efa	fix(execenv): switch every provider's Windows reply template to --content-file (#2411 ) Three user reports converge on the same Windows-shell encoding bug: - #2198 / #2236 — Chinese, Codex on Win11. Comments / descriptions generated by the agent arrive as `?`. - #2376 — Cyrillic, non-Codex agent ("Ops Lead") on Win11 Desktop. Title preserved (argv → CreateProcessW UTF-16), description / agent reply garbled (stdin → shell-codepage re-encoding). woodcoal's independent diagnosis on #2198 confirms the root cause: Windows PowerShell 5.1's `$OutputEncoding` defaults to ASCIIEncoding when piping to a native command, so non-ASCII bytes are silently replaced with `?` before they reach `multica.exe`. The CLI's stdin parsing is fine; the bytes are corrupted upstream, in the agent's shell layer. This PR ships the fix that supersedes the codex-only attempt in PR #2265 (which is closed in favour of this one): ## CLI Add `--content-file <path>` to `multica issue comment add` and `--description-file <path>` to `multica issue {create,update}`. The CLI reads bytes off disk via `os.ReadFile` and skips the shell entirely; UTF-8 survives end-to-end regardless of `$OutputEncoding` or `chcp`. The three input modes (`--content`, `--content-stdin`, `--content-file`) are mutually exclusive. ## Runtime config `buildMetaSkillContent`'s Available Commands section is rewritten as a neutral three-mode menu. The previous unconditional "MUST pipe via stdin" / `--description-stdin` mandate (over-spread from #1795 / #1851's Codex-multi-line fix) is gone for non-Codex providers; the strong directive now lives only in the Codex-Specific section, which branches on host: - Codex / Linux+macOS: `--content-stdin` + HEREDOC (preserves MUL-1467 fix against codex's literal `\n` habit). - Codex / Windows: `--content-file` (PowerShell ASCII pipe is the exact bug we're patching). 
## Per-turn reply template `BuildCommentReplyInstructions` now takes a provider arg and branches provider × OS: - Windows + any provider → `--content-file` (the bug is shell-layer, not provider-layer; #2376 shows non-Codex agents on Windows also hit it). All providers write a UTF-8 file with their file-write tool and post via `--content-file ./reply.md`. - Linux/macOS + Codex → stdin/HEREDOC (MUL-1467 protection). - Linux/macOS + non-Codex → lightweight pre-#1795 inline `--content "..."`. The CLI server-side decodes `\n`, so escaped multi-line works; the agent retains stdin / file as escape hatches for richer formatting. `BuildPrompt` and `buildCommentPrompt` gain a `provider` arg; `daemon.runTask` already has it in scope. ## Tests - `TestResolveTextFlag` — file-source verbatim with non-ASCII (`标题 / Заголовок / 中文段落`), missing-file error, empty-file rejection, three-way mutual exclusion. - `TestInjectRuntimeConfigAvailableCommandsIsNeutral` — every non-Codex provider × {linux, darwin, windows} pins the three-mode menu present + over-spread "MUST stdin" substrings absent. - `TestInjectRuntimeConfigCodexLinuxEmphasizesStdin` + `TestInjectRuntimeConfigCodexWindowsUsesContentFile` — Codex section's per-OS branch. - `TestBuildCommentReplyInstructionsCodexLinux` + `TestBuildCommentReplyInstructionsNonCodexLinux` + `TestBuildCommentReplyInstructionsWindowsUsesContentFile` — the reply-template provider × OS matrix. - `TestInjectRuntimeConfigWindowsCommentTriggerHasNoStdin` — end-to-end AGENTS.md / CLAUDE.md on Windows has no prescriptive stdin directive, for claude / codex / opencode. `go test ./...` and `go vet ./...` clean. Closes #2198, #2236, #2376. Co-authored-by: multica-agent <github@multica.ai>	2026-05-11 17:05:45 +08:00
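The three-way input rule described in this commit can be sketched roughly as follows. This is a minimal illustration, not the CLI's actual code: `resolveText` and `roundTripFile` are hypothetical names, and only the behaviours the message states (mutual exclusion, verbatim file bytes via `os.ReadFile`, missing-file error, empty-file rejection) are modelled.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// resolveText is a hypothetical helper: it returns the text from the single
// source the caller selected (inline value, stdin, or a file on disk).
func resolveText(inline string, useStdin bool, filePath string, stdin io.Reader) (string, error) {
	selected := 0
	if inline != "" {
		selected++
	}
	if useStdin {
		selected++
	}
	if filePath != "" {
		selected++
	}
	if selected > 1 {
		return "", errors.New("inline, stdin, and file sources are mutually exclusive")
	}
	switch {
	case filePath != "":
		// Bytes come straight off disk, so no shell codepage is involved.
		data, err := os.ReadFile(filePath)
		if err != nil {
			return "", fmt.Errorf("read %s: %w", filePath, err)
		}
		if len(data) == 0 {
			return "", fmt.Errorf("%s is empty", filePath)
		}
		return string(data), nil
	case useStdin:
		data, err := io.ReadAll(stdin)
		if err != nil {
			return "", fmt.Errorf("read stdin: %w", err)
		}
		return string(data), nil
	default:
		return inline, nil
	}
}

// roundTripFile writes content to a temporary reply.md and resolves it back
// through the file source; it exists only to demonstrate the sketch.
func roundTripFile(content string) (string, error) {
	dir, err := os.MkdirTemp("", "reply")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "reply.md")
	if err := os.WriteFile(path, []byte(content), 0o600); err != nil {
		return "", err
	}
	return resolveText("", false, path, nil)
}

func main() {
	got, err := roundTripFile("标题 / Заголовок / 中文段落")
	fmt.Println(got, err)
}
```

Reading the file inside the process is the point of the design: the non-ASCII text never passes through a PowerShell pipe, so `$OutputEncoding` cannot corrupt it.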
Kagura	702c48209b	fix(agent): stop filtering Pi extension tools via hardcoded --tools allowlist (#2379 ) (#2381 ) The Pi backend hardcoded `--tools read,bash,edit,write,grep,find,ls` in buildPiArgs. Pi's SDK treats --tools as a restrictive allowlist: only the listed tools pass through `_refreshToolRegistry()`, silently filtering out any user-installed extension tools registered via `pi.registerTool()`. Omitting --tools makes Pi's `allowedToolNames` undefined, so the `isAllowedTool()` filter becomes a no-op and all tools — built-in and extension — are available. This matches Pi's standalone behavior. Users who want to restrict tools can still pass --tools via custom_args (it is not in piBlockedArgs). Closes #2379	2026-05-11 16:11:32 +08:00
Bohan Jiang	fae8558263	fix(daemon): self-heal when a runtime is deleted server-side (#2404) Closes #2391.	2026-05-11 16:09:40 +08:00
Bohan Jiang	f5c2994aed	feat(workspace): revoke a member's runtimes when they leave or are removed (#2401 ) * feat(workspace): revoke a member's runtimes when they leave or are removed Previously, leaving or being removed from a workspace only deleted the member row — every runtime the departed user owned in that workspace remained in the DB, kept its daemon_token valid, and stayed reachable to the workspace's other members. The departed user lost access but their machine kept doing work. This change converges the runtime state in the same transaction as the member-row deletion: agents pinned to those runtimes are archived, in-flight tasks are cancelled (so the daemon's per-task status poller interrupts the running agent gracefully), the runtimes are forced offline, and the daemon_token rows are deleted. After commit the DaemonTokenCache is invalidated and agent:archived / daemon:register events fire so connected clients reconcile immediately. Server-side state convergence is the production safety net; the daemon_token revoke takes effect once the mdt_ flow is live (today most daemons fall back to PAT/JWT, and the member-row deletion is what stops those requests via requireWorkspaceMember). Daemon-side handling (recognising the resulting 401/404 and tearing down the local pairing for that workspace) lands in a follow-up. Co-authored-by: multica-agent <github@multica.ai> * fix(workspace): also cancel tasks for archived agents on member revoke CancelAgentTasksByRuntime only matched tasks whose runtime_id was in the revoked set, missing a real path: agent.runtime_id can be reassigned via UpdateAgent, but agent_task_queue.runtime_id keeps the value from when the task was queued. So an agent currently bound to the leaving member's runtime gets archived correctly, but its older tasks still pinned to a prior runtime stay 'queued' — and ClaimAgentTask does not gate on agent.archived_at, so those orphaned tasks remain claimable by the prior runtime. 
Replace CancelAgentTasksByRuntime with CancelAgentTasksByRuntimeOrAgent, which OR-matches runtime_ids and the archived agent IDs in one UPDATE. Pass the archived agent IDs through from revokeAndRemoveMember. Adds TestDeleteMember_CancelsTasksFromAgentReassignment as a regression guard: same agent, two runtimes, the older task on the surviving runtime must end up cancelled while the surviving runtime stays online. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-11 15:06:50 +08:00
Bohan Jiang	02310d083e	docs(util): clarify EnsureHiddenConsole call-order contract (#2399) Co-authored-by: multica-agent <github@multica.ai>	2026-05-11 14:45:54 +08:00
Kagura	fb026f2607	fix(daemon): suppress git console windows on Windows (#2358 ) * fix(daemon): suppress git console windows on Windows Apply the same HideConsoleWindow pattern used for agent processes (PR #1474) to all git commands spawned by the daemon's repo-cache, execenv, and GC packages. Each exec.Command now calls util.HideConsoleWindow(cmd) which sets CREATE_NEW_CONSOLE + HideWindow so grandchildren inherit a hidden console instead of flashing visible console windows. Closes #2357 Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com> * refactor: use EnsureHiddenConsole at daemon startup Replace per-site HideConsoleWindow(cmd) calls with a single EnsureHiddenConsole() invoked once at daemon startup. The daemon now owns a hidden console that every child process (git, cmd /c mklink, etc.) inherits automatically, eliminating the need for per-call SysProcAttr configuration. This also covers the previously missed exec.Command in codex_home_link_windows.go (cmd /c mklink) which never had a HideConsoleWindow call. Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com> --------- Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com> Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>	2026-05-11 14:41:07 +08:00
Multica Eve	d6349c16ec	feat(runtime): per-runtime timezone for token-usage aggregation (MUL-1950) (#2394) * feat: per-runtime timezone for token usage aggregation The runtime token-usage charts (daily and hourly tabs on the runtime-detail page) bucketed every event by the Postgres session timezone, which is UTC in production. For an operator in UTC+8 that meant a Tuesday afternoon's tasks landed in Tuesday early-morning's bar, so the chart was always one off. Fix: store an IANA timezone on agent_runtime and aggregate under it. * migrations 081 / 082 add agent_runtime.timezone (TEXT NOT NULL DEFAULT 'UTC') and rebuild the rollup pipeline (window function and both trigger functions) to compute bucket_date with AT TIME ZONE rt.timezone instead of bare DATE(). * No historical backfill: task_usage_daily rows already on disk keep their UTC bucket_date; only future writes / re-touches recompute under the new tz. (Product call from MUL-1950: 'guarantee future correctness'.) * runtime_usage.sql gains a @tz parameter on ListRuntimeUsage and GetRuntimeUsageByHour and threads tz through GetRuntimeTaskHourly Activity. ListRuntimeUsageDaily reads bucket_date as-is since the rollup already wrote it in tz. * parseSinceParamInTZ replaces the raw N×24h cutoff with start-of-day-N in the runtime's tz so 'last 7 days' lines up with bucket boundaries. * Daemon registration sends the host's IANA tz (TZ env, then time.Local), and UpsertAgentRuntime preserves any user override via a CASE-on-existing-value pattern so a daemon reconnect can't silently revert the operator's setting. * New PATCH /api/runtimes/:id endpoint (UpdateAgentRuntime) lets the runtime detail page edit the tz; the editor seeds with the browser tz on first interaction. 
Refs: MUL-1950 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai> * fix: harden runtime timezone rollups Co-authored-by: multica-agent <github@multica.ai> * fix: address runtime timezone review nits Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica.ai> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: Eve <eve@multica-ai.local>	2026-05-11 14:39:35 +08:00
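The "start-of-day-N in the runtime's tz" cutoff can be illustrated with a small sketch. It assumes "N" means local midnight N calendar days before today in the runtime's zone; the real `parseSinceParamInTZ` may count days differently, and both function names below are hypothetical.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the IANA database so LoadLocation works on any host
)

// sinceCutoff returns local midnight `days` days before `now`, evaluated in
// the IANA zone tz. Assumption: this is what "start-of-day-N" refers to.
func sinceCutoff(now time.Time, days int, tz string) (time.Time, error) {
	loc, err := time.LoadLocation(tz)
	if err != nil {
		return time.Time{}, err
	}
	y, m, d := now.In(loc).Date()
	// time.Date normalizes out-of-range days, so d-days may cross months.
	return time.Date(y, m, d-days, 0, 0, 0, 0, loc), nil
}

// sinceCutoffRFC3339 is a string-in, string-out wrapper for quick checks;
// the result is expressed in UTC.
func sinceCutoffRFC3339(nowRFC3339 string, days int, tz string) (string, error) {
	now, err := time.Parse(time.RFC3339, nowRFC3339)
	if err != nil {
		return "", err
	}
	cut, err := sinceCutoff(now, days, tz)
	if err != nil {
		return "", err
	}
	return cut.UTC().Format(time.RFC3339), nil
}

func main() {
	got, err := sinceCutoffRFC3339("2026-05-11T20:00:00Z", 7, "Asia/Shanghai")
	fmt.Println(got, err) // 2026-05-04T16:00:00Z <nil>
}
```

Unlike a raw N×24h subtraction, the cutoff always falls on a bucket boundary of the chosen zone, so the first bar in the chart is never a partial day.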
Multica Eve	e79ffc0f01	fix(agent): expand Copilot CLI model catalog with correct dotted IDs (#2336 ) * fix(agent): expand Copilot CLI model catalog with correct dotted IDs The Copilot CLI provider only exposed two models in the runtime dropdown, and one of them used the dashed legacy form `claude-sonnet-4-6` which `copilot --model` rejects with "Model ... is not available". The CLI accepts dotted IDs (e.g. `claude-sonnet-4.6`, `gpt-5.4`). Sync `copilotStaticModels()` with the official supported-models catalog so the dropdown surfaces the full set the user's account can route to (8 OpenAI + 4 Anthropic), and add a regression test that pins the expected IDs and bans the dashed form. Closes MUL-1948. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai> * feat(agent): dynamic Copilot model discovery via ACP session/new The previous static catalog could only ever lag behind the user's real entitlements and what GitHub ships. Copilot CLI exposes the live catalog through its ACP server (`copilot --acp`): the `session/new` response includes `models.availableModels` plus `currentModelId`, scoped to the authenticated account. Wire copilot through the existing discoverACPModels helper — already used by hermes/kimi/kiro — so the dropdown reflects the account's real catalog, including the `auto` entry and per-tier model availability (Pro / Pro+ / Enterprise / evaluation models). The Copilot CLI puts itself into ACP server mode via the `--acp` flag instead of an `acp` subcommand, so acpDiscoveryProvider now takes an optional acpArgs override. Copilot's ACP payload omits the vendor name, so a small prefix-based inferCopilotProvider keeps the UI's openai / anthropic / google grouping working. When the binary is missing or auth fails, fall back to copilotStaticModels() so self-hosted runtimes without a copilot install still see a populated dropdown. 
Verified against `copilot 1.0.44`: live discovery returns 13 models with gpt-5.5 marked Default. Closes MUL-1948. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai> * fix(agent): drop no-op COPILOT_ALLOW_ALL env and generalize OpenAI o-series prefix check - discoverCopilotModels: remove COPILOT_ALLOW_ALL=1 (not a real Copilot CLI env var; copy-pasta from HERMES_YOLO_MODE=1). Discovery only drives initialize + session/new which never trigger tool-permission prompts, so no extra env is needed. - inferCopilotProvider: replace the o1/o3/o4 prefix chain with a generic o<digit>+ check via isOpenAIReasoningSeriesID, so future o5/o6/… reasoning models are tagged as openai automatically. Guards against false positives like 'opus-…' or bare 'o'. - Extend TestInferCopilotProvider with o5/o6 forward-compat cases and negative cases (opus-fake, omni, o). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-11 14:36:43 +08:00
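The generic `o<digit>+` check could look something like the sketch below. The function name is hypothetical and the rule for what may follow the digits (end of string or a `-` separator) is an assumption; the commit only states that `o5`/`o6` should match while `opus-…`, `omni`, and a bare `o` should not.

```go
package main

import "fmt"

// isReasoningSeriesID reports whether id looks like an OpenAI o-series model
// ID: the letter 'o', one or more ASCII digits, then either the end of the
// string or a '-' suffix (assumed separator, e.g. "o3-mini").
func isReasoningSeriesID(id string) bool {
	if len(id) < 2 || id[0] != 'o' {
		return false
	}
	i := 1
	for i < len(id) && id[i] >= '0' && id[i] <= '9' {
		i++
	}
	if i == 1 {
		return false // no digit directly after 'o' (e.g. "opus-fake", "omni")
	}
	return i == len(id) || id[i] == '-'
}

func main() {
	for _, id := range []string{"o1", "o3-mini", "o5", "opus-fake", "omni", "o"} {
		fmt.Println(id, isReasoningSeriesID(id))
	}
}
```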
Multica Eve	72e89a74f3	fix: surface copilot failure details (#2396) Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-11 14:08:33 +08:00
Naiyuan Qing	a49222f37b	fix(realtime): allow same-origin WebSocket (mobile/CLI) (#2395 ) * fix(realtime): allow same-origin WebSocket clients (mobile/CLI) The previous CheckOrigin implementation (PR #2318) bypassed the Origin check whenever the request URL carried `client_platform=mobile` and no browser session cookie. That contract requires every native client to remember to add a query parameter — and in practice mobile clients hit ws://localhost:8080/ws with no extra params, so the Origin filled by the WebSocket library (the server's own host) gets rejected. Replace the platform-specific bypass with same-origin acceptance: if Origin's host equals the request Host, allow the upgrade. This is gorilla/websocket's default CheckOrigin behavior, restored alongside the existing cross-origin allowlist (for browser web/desktop clients). Native clients are now zero-config. CSRF defense is unaffected: SameSite=Strict cookies, the multica_csrf token, workspace membership check, and the allowlist itself remain in place. Browser CSWSH attacks fail both same-origin (browser forces Origin = page origin, not the server's Host) and allowlist checks. Refs: https://pkg.go.dev/github.com/gorilla/websocket https://cheatsheetseries.owasp.org/cheatsheets/WebSocket_Security_Cheat_Sheet.html Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * fix(realtime): use case-insensitive Host comparison for same-origin HTTP host is case-insensitive (RFC 7230 §2.7.3), and gorilla/websocket's default checkSameOrigin uses equalASCIIFold(u.Host, r.Host). The plain == comparison would reject legitimate same-origin requests with a case-mismatched Host header (e.g. Host: LOCALHOST:8080 vs Origin: http://localhost:8080). Switch to strings.EqualFold and cover the case with a regression test. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-11 13:42:42 +08:00
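A minimal sketch of the same-origin comparison, assuming the check works on the raw `Origin` header value and the request's `Host`. The function name is hypothetical, the cross-origin allowlist is left out, and the policy for a missing `Origin` header is not stated in the commit, so this sketch simply reports "not same-origin" and leaves that decision to the caller.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// sameOrigin reports whether the Origin header's host matches the request
// Host, ignoring case as HTTP hosts are case-insensitive.
func sameOrigin(originHeader, requestHost string) bool {
	if originHeader == "" {
		return false // assumption: caller decides what a missing Origin means
	}
	u, err := url.Parse(originHeader)
	if err != nil {
		return false
	}
	return strings.EqualFold(u.Host, requestHost)
}

func main() {
	fmt.Println(sameOrigin("http://localhost:8080", "LOCALHOST:8080")) // true
	fmt.Println(sameOrigin("https://evil.example", "localhost:8080")) // false
}
```

In a real upgrader this comparison would run before the allowlist lookup, so native clients that echo the server's own host pass without any extra query parameter.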
Bohan Jiang	b26f850d4e	feat(agents): gate private-agent surfaces with allowed_principals predicate (#2359) * feat(agents): gate private-agent surfaces with allowed_principals predicate Tighten chat/@-mention, history, edit, and delete entry points so private agents are only reachable by their owner or workspace owner/admin. Agent-to-agent traffic still bypasses the gate so A2A collaboration keeps working. - New canAccessPrivateAgent predicate in handler/agent_access.go; used by comment.enqueueMentionedAgentTasks (replacing the inline check), GetAgent, ListAgents (filter), ListAgentTasks, GetWorkspaceAgentRunCounts / Activity30d / TaskSnapshot (workspace-wide aggregations no longer leak private-agent existence + counts), chat.CreateChatSession, chat.SendChatMessage (re-checks on every send so role changes can't leave a stale session as a back-door), and autopilot.shouldSkipDispatch (caller = autopilot creator). - allowed_principals is computed inline as {agent.owner_id} ∪ workspace owner/admin members. No new table: manual config is intentionally not exposed in v1; the predicate is the extension seam. - Front-end agent detail page distinguishes 403 (private agent the caller can't access) from 404 (deleted/missing) and renders a "no access" placeholder with a back-to-agents button. - Go tests cover the pure predicate matrix + the four protected surfaces; vitest passes for the affected views. Co-authored-by: multica-agent <github@multica.ai> * feat(agents): gate issue assignment with the private-agent predicate Refactor validateAssigneePair to call the shared canAccessPrivateAgent helper. This closes the back door where a plain member could assign a private agent to an issue and let normal task dispatch run it, side-stepping the chat / @-mention gate. Agent callers (X-Agent-ID) bypass so A2A delegation onto a private assignee still works. Add an integration test covering all three callers (workspace owner, agent owner, plain member). 
Co-authored-by: multica-agent <github@multica.ai> * fix(agents): close three private-agent gate bypasses found in PR review 1. X-Agent-ID forgery (resolveActor): require X-Task-ID alongside X-Agent-ID before trusting the agent identity. Without this a plain workspace member could set X-Agent-ID to any visible agent UUID and short-circuit the gate to "actor=agent, allow". Daemons already pair the two headers, so legitimate A2A traffic is unaffected. 2. Chat history read path (chat.go): GetChatSession / ListChatMessages / GetPendingChatTask / MarkChatSessionRead now go through a new gateChatSessionForUser helper that re-applies canAccessPrivateAgent after the ownership check, so a session creator whose role was later downgraded loses transcript access. ListChatSessions and ListPendingChatTasks filter their result sets by the same predicate. 3. Cross-workspace @mention (comment.enqueueMentionedAgentTasks): resolve the mentioned agent via GetAgentInWorkspace scoped to the issue's workspace so a UUID belonging to a different workspace's private agent can't slip past the gate (the gate was being applied against the current workspace's role table, which is the wrong one). Regression tests cover each bypass, plus an update to the resolveActor unit test to reflect the new "X-Agent-ID without X-Task-ID falls back to member" contract. Co-authored-by: multica-agent <github@multica.ai> * test(handler): seed X-Task-ID alongside X-Agent-ID in existing agent-caller tests After tightening resolveActor to require both headers (X-Agent-ID + X-Task-ID) for the "agent" actor identity, three existing tests that set only X-Agent-ID started failing because their requests now resolve to "member" instead of "agent". Add createHandlerTestTaskForAgent helper and seed a task per agent-caller assertion. 
Also patch TestAgentExplicitMentionStillTriggers — it still passed only because the @mention path doesn't care about author type for member callers, but the test claims to exercise the agent path, so make it faithful. Co-authored-by: multica-agent <github@multica.ai> * test(handler): finish X-Task-ID seeding + fix cross-workspace mention test schema The previous CI run still failed in two places: 1. server/cmd/server integration tests — postCommentAsAgent → authRequestWithAgent only set X-Agent-ID, so resolveActor downgraded the request to "member" and the on_comment chain produced the wrong task counts. Fix: authRequestWithAgent now also sets X-Task-ID, fetched or seeded by a new ensureAgentTask(agentID) helper. 2. TestMentionAgent_RejectsCrossWorkspaceAgentUUID's hand-crafted comment INSERT was missing comment.workspace_id, which migration 025 made NOT NULL. Pass testWorkspaceID into the seed row. Build + vet clean locally; both packages compile. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-11 12:39:45 +08:00
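The gate described across these commits can be sketched as two small functions. The names `resolveActor` and `canAccessPrivateAgent` are borrowed from the commit messages, but the types, signatures, and bodies below are assumptions made for illustration; in particular, the real code presumably validates that the task belongs to the agent, whereas this sketch only checks that both headers are present.

```go
package main

import "fmt"

// Actor is a hypothetical caller identity: Kind is "agent" or "member".
type Actor struct {
	Kind string
	ID   string
}

// Agent is a hypothetical, trimmed-down agent record.
type Agent struct {
	OwnerID string
	Private bool
}

// resolveActor trusts the agent identity only when X-Agent-ID is paired with
// X-Task-ID; otherwise the request falls back to the authenticated member.
func resolveActor(userID, agentIDHeader, taskIDHeader string) Actor {
	if agentIDHeader != "" && taskIDHeader != "" {
		return Actor{Kind: "agent", ID: agentIDHeader}
	}
	return Actor{Kind: "member", ID: userID}
}

// canAccessPrivateAgent allows agent callers (A2A), the agent's owner, and
// workspace owners/admins; everyone else is denied for private agents.
func canAccessPrivateAgent(actor Actor, agent Agent, workspaceRole string) bool {
	if !agent.Private || actor.Kind == "agent" {
		return true
	}
	if actor.ID == agent.OwnerID {
		return true
	}
	return workspaceRole == "owner" || workspaceRole == "admin"
}

func main() {
	private := Agent{OwnerID: "u-owner", Private: true}
	forged := resolveActor("u-plain", "agent-123", "") // X-Task-ID missing
	fmt.Println(forged.Kind, canAccessPrivateAgent(forged, private, "member"))
}
```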
Bohan Jiang	39e57b870f	fix(cli): allow --mode run_only on autopilot create/update (#2360 ) * fix(cli): allow --mode run_only on autopilot create/update The autopilot run_only dispatch path is wired end-to-end (handler accepts the mode, AutopilotService.dispatchRunOnly enqueues a task with AutopilotRunID, daemon resolves workspace via autopilot_run -> autopilot in ClaimTaskByRuntime and TaskService.ResolveTaskWorkspaceID). The CLI guard was added before those fixes landed and never removed. Drop the CLI rejection on both create and update so callers can pick the same modes the API and UI already support, and remove the stale "unstable" callout from the autopilots docs. Closes multica-ai/multica#2347 Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): advertise autopilot run_only in agent runtime instructions The runtime config injected into AGENTS.md / CLAUDE.md only listed `--mode create_issue` for autopilot create and didn't expose `--mode` on update at all. So even after the CLI guard was lifted, agents reading their harness instructions would still believe create_issue was the only choice — undermining the "agents operate the same surface as humans" intent. Update both lines to advertise create_issue\|run_only on create and on update, and add an InjectRuntimeConfig assertion so the runtime prompt can't drift away from the CLI surface again. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-10 14:12:34 +08:00
Bohan Jiang	15c3886302	docs(daemon): refresh stale comment for inline system prompt path (#2362) The inline path now carries the full runtime brief (CLI catalog, workflow steps, persona, skills, project context) rather than just identity/persona instructions, after #2353 / #2355. The pre-existing comment still described it as "identity/persona instructions inline", which would mislead future maintainers about why the inline payload is load-bearing. Also call out kiro/kimi alongside openclaw/hermes since they were added to providerNeedsInlineSystemPrompt in #2328, and document the concrete failure mode (issues stuck in todo) so the rationale is searchable. Co-authored-by: multica-agent <github@multica.ai>	2026-05-10 14:00:08 +08:00
Kagura	a6968c7485	fix(daemon): inline runtime brief for providers that need system prompt (#2355 ) InjectRuntimeConfig writes the full meta skill content (CLI catalog, workflow instructions, project context, skills) to workdir/AGENTS.md, but providers like OpenClaw, Hermes, Kiro, and Kimi read bootstrap files from their own agent workspace — not the task workdir. The inline system prompt path (providerNeedsInlineSystemPrompt) only passed the agent persona instructions, so these providers never received the runtime brief. Have InjectRuntimeConfig return the rendered content so the daemon can both write it to disk (for file-reading providers) and pass it inline (for workspace-isolated providers). This avoids double-rendering and keeps the file and inline payloads identical. Fixes #2353	2026-05-10 13:57:05 +08:00
Bohan Jiang	b73a301bf9	fix(agent): drain stderr before deciding ACP failure promotion (#2333 ) `hermes`, `kimi`, and `kiro` all wired stderr through `cmd.Stderr = io.MultiWriter(logWriter, providerErrSniffer)`. The OS-pipe → MultiWriter copy goroutine that exec spawns for that form is only joined by `cmd.Wait()`, which the lifecycle goroutine fires in deferred cleanup — after `promoteACPResultOnProviderError` already consulted the sniffer. When stopReason=end_turn (success) raced ahead of the stderr drain, the sniffer's `lines` slice was empty, the helper fell through to the synthetic agent-text fallback ("hermes provider error: API call failed after 3 retries"), and the actionable upstream signal (HTTP 429 / usage limit) was lost. This was visible as a flaky `TestHermesBackendPromotesProviderErrorWithNonEmptyOutput` in CI under high parallelism — a real prod bug, not a test issue: live runs hit the same race when an upstream LLM returns 429 and hermes' synthetic agent turn beats the stderr drain to the parent. Replace the MultiWriter wiring with `cmd.StderrPipe()` + an explicit copier goroutine that signals on `stderrDone`. The lifecycle goroutine already awaits `<-readerDone` for stdout; add `<-stderrDone` next to it before `promoteACPResultOnProviderError` runs. The deferred `cmd.Wait()` ordering is unchanged — it just becomes a cheap reap by the time it fires. Verified: `go test ./pkg/agent/ -run "TestHermes\|TestKimi\|TestKiro" -count=10 -race`, then full package `-count=3 -race`, all green. Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 17:34:25 +08:00
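The `StderrPipe` plus explicit-copier pattern from this commit can be sketched as below. It is a simplified stand-in, not the backend's code: `runAndCaptureStderr` is a hypothetical helper, a plain buffer replaces the log writer and sniffer, and the demo command assumes a POSIX `sh` is on the PATH.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os/exec"
)

// runAndCaptureStderr starts the command, drains stderr in a dedicated
// goroutine, and only inspects the captured text after that goroutine has
// signalled completion. Wait is called last, as os/exec requires all reads
// from the pipe to finish before Wait.
func runAndCaptureStderr(name string, args ...string) (string, error) {
	cmd := exec.Command(name, args...)
	stderr, err := cmd.StderrPipe()
	if err != nil {
		return "", err
	}
	if err := cmd.Start(); err != nil {
		return "", err
	}

	var buf bytes.Buffer
	stderrDone := make(chan struct{})
	go func() {
		defer close(stderrDone)
		_, _ = io.Copy(&buf, stderr) // returns when the child closes stderr
	}()

	<-stderrDone // the drain is complete; buf is safe to read from here on
	captured := buf.String()

	// By now Wait is just a cheap reap of the exited child.
	return captured, cmd.Wait()
}

func main() {
	out, err := runAndCaptureStderr("sh", "-c", "echo 'HTTP 429' 1>&2")
	fmt.Printf("%q %v\n", out, err)
}
```

The channel makes the ordering explicit: any decision based on stderr content happens strictly after the copy has finished, which is the property the `io.MultiWriter` wiring could not guarantee.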
Bohan Jiang	d713b57072	fix(daemon): add kiro and kimi to providerNeedsInlineSystemPrompt whitelist (#2328) Kiro and Kimi share Hermes' ACP architecture and already accept SystemPrompt prepended in front of the user prompt (kiro.go:244-247, kimi.go:256-257). Without daemon-side opt-in, ExecOptions.SystemPrompt is never set, so per-task agent identity instructions are lost in deployments that rely on inline injection (e.g. K3 Lens-style daemon → wrapper → docker compose exec acp). Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 16:54:27 +08:00
LinYushen	f70105fb12	fix(agent): include JSON-RPC error data field in ACP error messages (#2327 ) ACP backends (Kiro, Hermes, Kimi) put the actionable reason for code=-32603 'Internal error' in the JSON-RPC `data` field, e.g. "No session found with id". The wrapped Go error only carried `code` and `message`, leaving operators staring at a bare "kiro session/prompt failed: session/prompt: Internal error (code=-32603)" with no way to tell apart session expiry, model unavailability, lost auth, or quota. Parse `data` too. Strings render unquoted; objects/arrays render as raw JSON; null/missing keeps the previous format unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-09 16:19:57 +08:00
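The rendering rules for the `data` field (strings unquoted, objects and arrays as raw JSON, null or missing leaves the message unchanged) might be implemented along these lines. The struct, the function name, and the `": "` separator before the data are assumptions for illustration; only the three rules come from the commit message.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// rpcError models the JSON-RPC 2.0 error object; Data stays raw so that its
// JSON type can be inspected before rendering.
type rpcError struct {
	Code    int             `json:"code"`
	Message string          `json:"message"`
	Data    json.RawMessage `json:"data,omitempty"`
}

// formatRPCError is a hypothetical formatter for an error object received
// in response to the given method.
func formatRPCError(method, rawError string) (string, error) {
	var e rpcError
	if err := json.Unmarshal([]byte(rawError), &e); err != nil {
		return "", err
	}
	base := fmt.Sprintf("%s: %s (code=%d)", method, e.Message, e.Code)

	data := bytes.TrimSpace(e.Data)
	if len(data) == 0 || bytes.Equal(data, []byte("null")) {
		return base, nil // missing or null data: previous format
	}
	var s string
	if err := json.Unmarshal(data, &s); err == nil {
		return base + ": " + s, nil // JSON string: render without quotes
	}
	return base + ": " + string(data), nil // object, array, etc.: raw JSON
}

func main() {
	msg, _ := formatRPCError("session/prompt",
		`{"code":-32603,"message":"Internal error","data":"No session found with id"}`)
	fmt.Println(msg)
}
```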
Bohan Jiang	c57546159d	fix(daemon): mark provider 429 / out-of-credit agent runs as failed, not completed (#2323 ) * fix(daemon): mark provider 429 / out-of-credit runs as failed, not completed Two bugs combined to silently report failed agent runs as "Completed" in the UI when the upstream LLM returned a 4xx (e.g. HTTP 429 rate-limit / no credit on the account). 1. ACP backends (hermes, kimi, kiro) only promoted the run status to "failed" when their stderr sniffer fired AND the agent output buffer was empty. But hermes injects a synthetic agent text turn ("API call failed after 3 retries: HTTP 429...") on retry exhaustion, so the buffer was never empty in the rate-limit case and the promotion never ran. Drop the empty-output precondition: the sniffer's regex (HTTP-status markers, named error types) is specific enough to trust on its own. 2. The daemon's task-result switch only routed "blocked" through FailTask; every other status — including "cancelled", and any future status we forget to enumerate — fell through to CompleteTask. Invert it so only an explicit "completed" status reports success, and extract the switch into reportTaskResult for direct testing. Cancelled now defaults to failure_reason "cancelled" instead of being silently completed. Closes GitHub multica#1952. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): only promote ACP run to failed on terminal provider error Address GPT-Boy's review on the multica#1952 fix. The previous promotion rule ("any sniffer line → fail") was too broad: the existing sniffer also captures transient per-attempt warnings ("API call failed (attempt 1/3): RateLimitError [HTTP 429]"), and those lines stay in the buffer for the rest of the run. A retry sequence whose first attempt blipped but whose third attempt succeeded would have been wrongly reported as failed. 
Tighten the criteria with two additional signals, both defined on the existing acpProviderErrorSniffer / output buffer: - acpTerminalErrorRe — sticky `terminal` flag set when stderr shows an exhausted/non-retryable marker (❌, [ERROR], "after N retries", Non-retryable, BadRequestError, AuthenticationError). Per-attempt warnings deliberately don't match. - acpAgentOutputTerminalRe — matches the synthetic "API call failed after N retries..." turn that hermes-style adapters inject into the agent text stream when they give up; this catches multica#1952 even if hermes' stderr only logged transient attempts. Promotion logic becomes a shared helper, promoteACPResultOnProviderError, called from hermes / kimi / kiro. Promotes when (a) terminalMessage is non-empty, (b) output contains the synthetic give-up turn, or (c) output is empty and the sniffer captured anything at all (preserves the original empty-output safety net for transient-only sequences with no real result to fall back on). Tests: - TestHermesProviderErrorSnifferTerminalVsTransient — transient attempt 1/3 alone returns terminalMessage="" but message!=""; a follow-on terminal marker flips terminal on. - TestHermesProviderErrorSnifferTerminalNonRetryable — confirms BadRequest / Authentication / Non-retryable / ❌ / [ERROR] are classified terminal even on the very first attempt. - TestHermesBackendDoesNotPromoteOnTransientRetry — fake hermes emits attempt 1/3 to stderr then a normal agent text turn and end_turn; resulting Status must stay "completed". Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 16:13:12 +08:00
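The two rules from this commit, the three-way promotion test and the inverted status switch, can be sketched as follows. The function names and the regular expression are hypothetical stand-ins; the real `acpTerminalErrorRe` and `acpAgentOutputTerminalRe` patterns are not shown in the message, so the pattern here only covers the quoted "API call failed after N retries" phrasing.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// giveUpTurnRe approximates the synthetic give-up turn quoted in the commit.
var giveUpTurnRe = regexp.MustCompile(`API call failed after \d+ retries`)

// shouldPromoteToFailed applies the three conditions: (a) a terminal stderr
// marker was seen, (b) the agent output contains the give-up turn, or (c) the
// output is empty and the sniffer captured anything at all.
func shouldPromoteToFailed(terminalMessage, snifferMessage, output string) bool {
	if terminalMessage != "" {
		return true
	}
	if giveUpTurnRe.MatchString(output) {
		return true
	}
	return strings.TrimSpace(output) == "" && snifferMessage != ""
}

// reportOutcome mirrors the inverted switch: only an explicit "completed"
// status counts as success, every other status is reported as a failure.
func reportOutcome(status string) string {
	if status == "completed" {
		return "complete"
	}
	return "fail"
}

func main() {
	transient := "API call failed (attempt 1/3): RateLimitError [HTTP 429]"
	fmt.Println(shouldPromoteToFailed("", transient, "Here is the answer."))
	fmt.Println(reportOutcome("cancelled"))
}
```

The first call illustrates the review fix: a transient warning on stderr followed by a real agent answer does not flip the run to failed.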
Bohan Jiang	003dfd9b4b	feat(quick-create): add project picker that remembers last pick (#2321 ) * feat(quick-create): add project picker that remembers last pick Quick-create users targeting one project repeatedly had to restate "in project X" in every prompt. The modal now exposes a project picker beside the agent picker, persists the selection per-workspace, and pins the agent's `multica issue create` invocation to that project so the prompt text doesn't have to. The picked project also flows to the daemon as ProjectID/ProjectTitle and its github_repo resources override the workspace repo fallback — same treatment issue-bound tasks already get. Co-authored-by: multica-agent <github@multica.ai> * fix(quick-create): move project picker into property pill row Reviewer feedback: the picker felt out of place wedged next to the agent header. Move it into a property toolbar row above the footer, reusing the shared `ProjectPicker` + `PillButton` so its placement and styling line up exactly with the manual create panel. This also drops the bespoke dropdown / aria / label strings that were only needed while the picker rendered inline beside "Created by". Co-authored-by: multica-agent <github@multica.ai> * fix(quick-create): clear stale persisted project + carry across mode switch Two review-blocking bugs in PR #2321: 1. The stale-id sweep in AgentCreatePanel only fired when projects.length > 0 and only cleared local state, leaving lastProjectId pointing at a deleted project. The next open re-seeded the dead UUID and submit hit the server's `project not found` rejection. Gate on the query's `isSuccess` so we can tell "loading" apart from "loaded as empty", and clear both local state and the persisted preference when the selection isn't in the resolved list. 2. 
ManualCreatePanel's switchToAgent dropped the picked project from the carry payload, so flipping manual → agent silently fell back to the agent panel's own lastProjectId — potentially routing the issue to a different project than the one shown in manual mode. Forward project_id alongside prompt / agent_id, and add a regression test. Co-authored-by: multica-agent <github@multica.ai> * test(quick-create): pass new isExpanded props in stale-project tests Main got an expand button on AgentCreatePanel via #2320 while this branch was open, adding `isExpanded` / `setIsExpanded` to the panel's required props. The two new stale-project tests still passed `{ onClose }` only, which CI's typecheck (run on the main+branch merge) caught while my local run did not. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 16:12:12 +08:00
Bohan Jiang	3f20999597	refactor(timeline): drop server-side comment + timeline pagination (#2322 ) * refactor(timeline): drop server-side comment + timeline pagination (MUL-1929) The cursor-paginated /timeline and /comments endpoints were sized for a problem the data shape doesn't have: prod p99 is ~30 comments per issue and the all-time max is ~1.1k. Time-based pagination also splits reply threads across page boundaries (orphan replies), which the frontend was papering over with an "orphan rescue" that promoted disconnected replies to top-level — confusing UX with no real benefit. Replace both endpoints with a single full-issue fetch, capped server-side at 2000 rows as a defensive safety net (never hit in practice). Server - /api/issues/:id/timeline now returns a flat ASC TimelineEntry[] (matches the legacy desktop contract — older Multica.app builds keep working because the wrapped TimelineResponse + cursors are gone, and the raw array shape was always what they consumed). - /api/issues/:id/comments drops limit/offset; only ?since is honoured for the CLI agent-polling flow. - Drop ListCommentsBefore/After/Latest, ListActivitiesBefore/After/Latest and the timelineCursor encoding. - Replace with ListCommentsForIssue / ListCommentsSinceForIssue / ListActivitiesForIssue (capped by argument). CLI - multica issue comment list drops --limit / --offset and the X-Total-Count reporting; --since is preserved for incremental polling. Frontend - Replace useInfiniteQuery with useQuery in useIssueTimeline; drop fetchOlder/Newer, jumpToLatest, isAtLatest, newEntriesBelowCount. - Remove timeline-cache helpers (mapAllEntries / filterAllEntries / prependToLatestPage) and the TimelinePage / TimelinePageParam types. - WS event handlers update the single flat-array cache directly. - Drop the orphan-reply rescue in issue-detail — every reply's parent is now guaranteed to be in the same array. - Strip the "show older / show newer / jump to latest" buttons and their i18n strings. 
Co-authored-by: multica-agent <github@multica.ai> * fix(timeline): address review feedback on pagination removal Three issues caught in PR #2322 review: 1. /timeline broke for stale clients between #2128 and this PR. They send ?limit/?before/?after/?around and parse with the wrapped TimelinePageSchema; the new flat-array response was failing schema validation and falling back to an empty timeline. Restore the wrapped shape on those query params (DESC entries, null cursors, has_more_=false), keeping the flat ASC array for bare requests. Around-mode now also fills target_index from the merged slice so legacy clients can still scroll-to-anchor without a follow-up. 2. The agent prompts in runtime_config.go and prompt.go still told agents that `multica issue comment list` accepts --limit/--offset and to use `--limit 30` on truncated output. With those flags removed in this PR, new agent runs would hit "unknown flag" or skip context. Update the prompt copy to "returns all comments, capped at 2000; --since for incremental polling". 3. useCreateComment's onSuccess was a bare append to the timeline cache with no id-dedupe, so a fast comment:created WS event firing before onSuccess produced a transient duplicate. Restore the id guard the old prependToLatestPage helper used to provide. Adds two new boundary tests: - TestListTimeline_LegacyWrappedShape_OnPaginationParams - TestListTimeline_LegacyWrappedShape_AroundFillsTargetIndex Co-authored-by: multica-agent <github@multica.ai> test(handler): fix timeline test assertions for handler-package isolation The TestListTimeline_* assertions assumed CreateIssue would seed an "issue_created" activity_log row, but the activity listener that publishes those rows is registered in cmd/server/main.go — handler-package tests don't wire it up. CI saw 5 entries (3 comments + 2 activities) where the test expected ≥6. 
Drop the auto-activity assumption: assert exactly 5 entries in TestListTimeline_MergesCommentsAndActivities, and tighten TestListTimeline_EmptyIssue to assert a fully-empty timeline. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 16:11:58 +08:00
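A rough Go sketch of the compatibility rule described in the review-feedback commit above: bare `/timeline` requests get the flat ascending array, while requests carrying any of the old pagination parameters get the legacy wrapped shape in descending order. The helper names `wantsLegacyShape` and `reversed` are hypothetical and do not come from the repository; only the parameter names are taken from the commit message.

```go
package main

import (
	"fmt"
	"net/url"
)

// legacyPaginationParams are the query parameters the commit message says
// stale clients still send.
var legacyPaginationParams = []string{"limit", "before", "after", "around"}

// wantsLegacyShape is a hypothetical helper. It reports whether a raw query
// string carries any legacy pagination parameter, in which case the handler
// would answer with the wrapped response instead of the flat array.
func wantsLegacyShape(rawQuery string) bool {
	// Parse errors are ignored here; ParseQuery still returns the pairs
	// it managed to decode.
	q, _ := url.ParseQuery(rawQuery)
	for _, name := range legacyPaginationParams {
		if q.Has(name) {
			return true
		}
	}
	return false
}

// reversed returns a copy of ids in reverse order, turning the ascending
// list used for bare requests into the descending list legacy clients expect.
func reversed(ids []string) []string {
	out := make([]string, len(ids))
	for i, id := range ids {
		out[len(ids)-1-i] = id
	}
	return out
}

func main() {
	asc := []string{"c1", "c2", "c3"}
	for _, raw := range []string{"", "since=2026-05-09", "limit=30&before=abc"} {
		if wantsLegacyShape(raw) {
			fmt.Printf("%q -> wrapped, entries %v\n", raw, reversed(asc))
		} else {
			fmt.Printf("%q -> flat, entries %v\n", raw, asc)
		}
	}
}
```

Note that `?since` alone does not trigger the legacy shape in this sketch, since the commit lists only `limit`, `before`, `after`, and `around` for `/timeline`.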
Bohan Jiang	9ded462ecc	feat(inbox): auto-archive stale task_failed rows on terminal status (#2319 ) When an issue progresses to in_review / done / cancelled, archive any pre-existing task_failed inbox rows for that issue across all member recipients and emit inbox:batch-archived per recipient so connected clients self-heal. Reuses the existing archived column rather than introducing a parallel dismissed flag; the activity log preserves the full failure history for audit independently of the inbox surface. Closes #2291. Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 15:53:25 +08:00
Multica Eve	4b8939e78e	fix: allow mobile websocket origin without cookies (#2318 ) Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 15:14:16 +08:00
Multica Eve	a2dd80d4f6	feat(autopilot): skip dispatch when assignee runtime is offline (MUL-1899) (#2311 ) * feat(autopilot): skip dispatch when assignee runtime is offline (MUL-1899) Prevents scheduled autopilots from accumulating doomed tasks against offline / archived / unbound agents. Before this change, a paused laptop or crashed daemon would let a 5-minute-cron autopilot pile up thousands of queued agent_task_queue rows that no runtime would ever drain — this is the dominant source of the 89k stuck-task backlog flagged in MUL-1899. DispatchAutopilot now performs a pre-flight admission check on the assignee agent's runtime status. If the runtime is not 'online' (or the agent is archived / has no runtime bound / has no assignee), the run is recorded as 'skipped' with a failure_reason and no task is enqueued. Skipped runs still emit autopilot:run.done so the UI / activity feed reflect that the trigger fired and was evaluated. Skipped runs are deliberately NOT counted toward the failure-ratio auto-pause: a user who closes their laptop overnight should not have their autopilot paused. Sustained server-side failures keep their existing pause path via the failure monitor. Tests: added an integration test that creates an offline runtime and asserts DispatchAutopilot records a skipped run with no task enqueued. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai> * feat(scheduler): expire stale queued tasks via TTL sweeper (MUL-1899) Companion to the dispatch-time admission gate added in this PR. The admission gate prevents new tasks from being enqueued against an offline runtime, but it does not drain the historical backlog (~89k stuck queued rows observed at MUL-1899 baseline) and does not help when a runtime goes offline after a task has already been queued. 
This adds a passive TTL sweeper: - New SQL query `ExpireStaleQueuedTasks` transitions queued tasks older than the TTL to status='failed' with failure_reason='queued_expired' and a clear error message. - Sweep is capped per tick (`queuedExpireBatchSize`, default 500) via a CTE+LIMIT so that draining a large backlog cannot monopolise the DB on a single tick. At 30s ticks the worst case is 60k rows/hour. - Wired into the existing 30s `runRuntimeSweeper` loop alongside `sweepStaleTasks` and reuses `taskSvc.HandleFailedTasks` so the expired tasks broadcast `task:failed` events, reconcile agent status, and roll back any in-progress issues — same lifecycle as any other failed task. - Default TTL = 2h. Conservatively above any reasonable "queued behind a long-running task" window (default agent timeout is 2h, sweeper runs every 30s) so legitimate work isn't expired. - Integration tests cover the happy path (stale → expired, fresh → left alone, correct status/reason/error) and the per-tick batch cap. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai> * fix(autopilot): address review blockers from PR #2311 (MUL-1899) GPT-Boy review of the offline-runtime + queued-TTL PR flagged four blockers; this commit addresses them all. 1. Restore the 'skipped' autopilot_run status in the DB constraint. Migration 043 had removed 'skipped' along with the now-defunct concurrency_policy feature, so the new admission gate's INSERT of status='skipped' violated `autopilot_run_status_check` and broke `TestAutopilotDispatchSkipsWhenRuntimeOffline` in CI. New migration 079 re-adds 'skipped' to the CHECK list. The down migration migrates skipped → failed before re-tightening, mirroring what 043 did for the original removal. 2. Make `ExpireStaleQueuedTasks` race-safe. The CTE-then-UPDATE pattern could clobber a task that the daemon claimed between victim selection and the outer update.
Two guards added: - `FOR UPDATE SKIP LOCKED` in the CTE so we never wait on a row that's currently being claimed (and never block the claim path either). - The outer UPDATE now re-checks `t.status = 'queued'` AND the TTL predicate so even if a row's lock is released after a successful claim, we cannot transition a now-dispatched/running task to 'failed'. 3. Add a partial index for the queued-TTL sweeper. `idx_agent_task_queue_queued_created_at` on `created_at WHERE status = 'queued'` — keeps the 30s sweep query (status=queued AND created_at < ... ORDER BY created_at LIMIT 500) cheap even when historical terminal rows accumulate (~89k+ at MUL-1899 baseline). The partial predicate keeps the index tiny because only in-flight rows live in 'queued'. 4. Fix the failure-monitor denominator. `SelectAutopilotsExceedingFailureThreshold` had been counting 'skipped' toward total runs, which would have diluted the failure ratio: a 100%-failing autopilot could mask itself behind a wall of admission skips. With 'skipped' restored as a real status, the auto-pause monitor must explicitly exclude it from BOTH numerator and denominator — admission skips are neither a success nor a failure. Verified: `go test ./cmd/server/... ./internal/service/...` passes (including TestAutopilotDispatchSkipsWhenRuntimeOffline, TestExpireStaleQueuedTasks, TestExpireStaleQueuedTasksRespectsBatchLimit). `go build ./... && go vet ./...` clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai> * fix(migrations): split queued-task TTL index into concurrent migration Per PR #2311 review: agent_task_queue is a hot table, so building the new partial index with plain CREATE INDEX inside migration 079 would hold ACCESS EXCLUSIVE on the queue and block dispatch during deploy.
The migration runner does not allow CONCURRENTLY to share a file with other statements (documented in 068), so split the index into its own single-statement file 080 — matching the existing pattern in 035 / 067 / 074 / 075 / 078. Migration 079 keeps the autopilot_run constraint change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 15:07:57 +08:00
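The batch-capped TTL sweep described in the commit above can be illustrated with a small in-memory Go sketch. This is not the repository's `ExpireStaleQueuedTasks` SQL. The `task` type, the function names, and the use of plain Unix seconds for timestamps are assumptions made to keep the example self-contained. It shows oldest-first victim selection, the per-tick cap, the re-check of status and TTL before the transition, and the arithmetic behind the 60k rows/hour worst case.

```go
package main

import (
	"fmt"
	"sort"
)

// task is a hypothetical stand-in for an agent_task_queue row.
// CreatedAt is in Unix seconds to keep the sketch small.
type task struct {
	ID            int
	Status        string
	FailureReason string
	CreatedAt     int64
}

// expireStaleQueued fails at most batchSize queued tasks whose age exceeds
// ttlSeconds, oldest first, and returns how many it expired.
func expireStaleQueued(tasks []*task, now, ttlSeconds int64, batchSize int) int {
	cutoff := now - ttlSeconds

	// Victim selection: the role the CTE+LIMIT plays in the real query.
	var victims []*task
	for _, t := range tasks {
		if t.Status == "queued" && t.CreatedAt < cutoff {
			victims = append(victims, t)
		}
	}
	sort.Slice(victims, func(i, j int) bool {
		return victims[i].CreatedAt < victims[j].CreatedAt
	})
	if len(victims) > batchSize {
		victims = victims[:batchSize]
	}

	expired := 0
	for _, t := range victims {
		// Re-check before the transition, mirroring the guard on the outer
		// UPDATE. It is redundant in this single-threaded sketch but matters
		// when another worker can claim the row in between.
		if t.Status != "queued" || t.CreatedAt >= cutoff {
			continue
		}
		t.Status = "failed"
		t.FailureReason = "queued_expired"
		expired++
	}
	return expired
}

// worstCaseRowsPerHour is the drain ceiling implied by the tick interval
// and the per-tick cap.
func worstCaseRowsPerHour(tickSeconds, batchSize int) int {
	return 3600 / tickSeconds * batchSize
}

func main() {
	const now, ttl = 10000, 7200 // 2h TTL, so the cutoff is 2800
	tasks := []*task{
		{ID: 1, Status: "queued", CreatedAt: 0},
		{ID: 2, Status: "queued", CreatedAt: 100},
		{ID: 3, Status: "queued", CreatedAt: 9000}, // fresh, left alone
		{ID: 4, Status: "running", CreatedAt: 0},   // not queued, left alone
	}
	fmt.Println(expireStaleQueued(tasks, now, ttl, 1))   // cap of 1: only the oldest row
	fmt.Println(expireStaleQueued(tasks, now, ttl, 500)) // next tick picks up the rest
	fmt.Println(worstCaseRowsPerHour(30, 500))           // 120 ticks * 500 rows
}
```

With a 30s tick and a cap of 500, the ceiling works out to 120 × 500 = 60,000 rows per hour, which matches the figure quoted in the commit message.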
Bohan Jiang	6d9ebb0fdd	fix(daemon): unblock issues stuck on a poisoned-image agent session (#2314 ) * fix(daemon): treat upstream API 400 invalid_request_error as poisoned session A markdown-linked image in an issue description that the agent downloads as a tiny CDN auth-error file and Read's as a PNG poisons the conversation: the LLM API rejects the bad image with 400 invalid_request_error, the session_id is pinned mid-flight, and every follow-up task on the issue (comment-trigger, auto-retry) resumes the same poisoned conversation and hits the same 400 — the issue can no longer be executed even after the description is cleaned up. Mirror the existing fallback-output classifier on the error side: detect "API Error: ... 400 ... invalid_request_error" in the agent error string, persist failure_reason='api_invalid_request', and add it to the GetLastTaskSession exclusion list so the next task starts a fresh session that re-reads the (now-clean) description. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): unblock issues already poisoned by API 400 invalid_request_error The forward-only classifier from the previous commit only tags new failures. Issues like MUL-1918 already have multiple failed-task rows whose failure_reason is the pre-fix default 'agent_error', and GetLastTaskSession falls back to those legacy rows on the next claim — so deploying the classifier alone leaves existing poisoned issues stuck (GPT-Boy review on PR #2314). Two complementary changes: - Migration 079 backfills failure_reason='api_invalid_request' on every pre-existing 'agent_error' row whose error text matches the canonical Anthropic 400 invalid_request_error shape. Keeps observability consistent (multica issue runs / UI now report the right reason). - GetLastTaskSession adds a defensive ILIKE clause on error text. 
Closes the deploy-window gap where the old binary could write a new 'agent_error' row between the migration running and the new code taking over, and protects against future error-format variants the daemon classifier might miss. Plus regression tests covering the legacy + new coexistence case GPT-Boy flagged, and a guard rail asserting benign 'agent_error' failures (timeouts, tool errors) still resume their session. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 14:39:10 +08:00
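A minimal Go sketch of the error-side classifier this commit describes, assuming plain substring matching on the agent error string. The function names and the exact matching rule are hypothetical; the repository's classifier and the `GetLastTaskSession` exclusion list may differ, and only the single poisoned reason named in the commit is modelled here.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	reasonAgentError        = "agent_error"         // pre-fix default named in the commit
	reasonAPIInvalidRequest = "api_invalid_request" // new reason added by the commit
)

// classifyFailure is a hypothetical classifier: it tags an error as a
// poisoned-session failure when the text carries all three markers from the
// "API Error: ... 400 ... invalid_request_error" shape. It does not check
// the order of the markers.
func classifyFailure(errText string) string {
	if strings.Contains(errText, "API Error") &&
		strings.Contains(errText, "400") &&
		strings.Contains(errText, "invalid_request_error") {
		return reasonAPIInvalidRequest
	}
	return reasonAgentError
}

// canResumeSession reports whether the next task may reuse the previous
// session. Only the poisoned reason forces a fresh session in this sketch.
func canResumeSession(failureReason string) bool {
	return failureReason != reasonAPIInvalidRequest
}

func main() {
	for _, e := range []string{
		`API Error: 400 {"type":"error","error":{"type":"invalid_request_error"}}`,
		"tool execution timed out",
	} {
		reason := classifyFailure(e)
		fmt.Printf("%-20s resume=%v\n", reason, canResumeSession(reason))
	}
}
```

Benign failures such as timeouts keep the `agent_error` reason and still resume their session, which is the guard rail the commit's regression tests assert.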
Valentin Mihov	560e081d8f	Pass agent instructions inline to Hermes (#2283 )	2026-05-09 14:23:41 +08:00
Multica Eve	46eed3b298	Add task dispatched analytics event (#2310 ) Co-authored-by: Devv <devv@Devvs-Mac-mini.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 14:11:20 +08:00
Bohan Jiang	0eb23df234	fix(agent): scope pi colon-to-slash normalization to legacy format (#2309 ) PR #2281 added table-format support to parsePiModels but kept the unconditional `strings.Replace(":", "/", 1)`, which would silently rewrite a `:` inside a model name read from column 1 of the table output (e.g. `claude-sonnet-4-6:exp` would become `claude-sonnet-4-6/exp`). Move the replace into the legacy `provider:model` branch so only the colon-as-separator case is normalized, and restore a short doc comment describing the dual-format contract. Test extended with a colon-bearing table row. Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 13:56:49 +08:00
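The scoping fix in this commit can be sketched in Go as follows. This is not the repository's `parsePiModels`. The function name `normalizeModelLine` is hypothetical, and the table layout (whitespace-separated columns with the provider in column 0 and the model name in column 1) is an assumption made for illustration. The point it shows is that the colon-to-slash replacement runs only in the legacy `provider:model` branch.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeModelLine is a hypothetical helper that turns one line of model
// listing output into a "provider/model" identifier.
//
// Assumed formats (not confirmed by the source):
//   - legacy: a single token "provider:model"
//   - table:  whitespace-separated columns, provider first, model second
func normalizeModelLine(line string) string {
	fields := strings.Fields(line)
	switch len(fields) {
	case 0:
		return ""
	case 1:
		// Legacy branch: the first colon is the provider/model separator.
		return strings.Replace(fields[0], ":", "/", 1)
	default:
		// Table branch: the model column is used verbatim, so a colon
		// inside the model name survives.
		return fields[0] + "/" + fields[1]
	}
}

func main() {
	fmt.Println(normalizeModelLine("anthropic:claude-sonnet-4-6"))
	fmt.Println(normalizeModelLine("anthropic  claude-sonnet-4-6:exp  200k"))
}
```

Under these assumptions the first call yields `anthropic/claude-sonnet-4-6`, and the second keeps the `:exp` suffix intact, which is the case the extended test in the commit covers.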


753 Commits