* feat(issues): server-side label + filter querying for issue list
Extends GET /api/issues with label_ids, priorities, creator_ids,
project_ids, include_no_assignee, and include_no_project params, and
moves the existing single-value filters onto array-form. Each filter
becomes part of the SQL WHERE clause so paginated buckets reflect the
user's selection — fixes the bug where client-side filtering hid
matches sitting past the first page (#1491).
CLI gains a repeatable --label flag; legacy --priority/--assignee/
--project keep working via the single-value compatibility paths.
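A minimal sketch of how array-form filters could fold into one WHERE
clause, so pagination runs over the filtered set. The IssueFilter
struct, the buildWhere helper and the column names are assumptions made
for illustration, not the handler's real code:

```go
package main

import (
	"fmt"
	"strings"
)

// IssueFilter is a hypothetical, trimmed-down filter; the real handler
// accepts more fields (labels, creators, projects, ...).
type IssueFilter struct {
	Priorities        []string
	AssigneeIDs       []string
	IncludeNoAssignee bool
}

// buildWhere returns a WHERE clause plus its positional arguments.
// Column names here are assumed, not taken from the schema.
func buildWhere(workspaceID string, f IssueFilter) (string, []any) {
	clauses := []string{"workspace_id = $1"}
	args := []any{workspaceID}

	// next appends an argument and returns its placeholder, e.g. "$2".
	next := func(v any) string {
		args = append(args, v)
		return fmt.Sprintf("$%d", len(args))
	}

	if len(f.Priorities) > 0 {
		clauses = append(clauses, "priority = ANY("+next(f.Priorities)+")")
	}
	switch {
	case len(f.AssigneeIDs) > 0 && f.IncludeNoAssignee:
		clauses = append(clauses,
			"(assignee_id = ANY("+next(f.AssigneeIDs)+") OR assignee_id IS NULL)")
	case len(f.AssigneeIDs) > 0:
		clauses = append(clauses, "assignee_id = ANY("+next(f.AssigneeIDs)+")")
	case f.IncludeNoAssignee:
		clauses = append(clauses, "assignee_id IS NULL")
	}
	return strings.Join(clauses, " AND "), args
}

func main() {
	where, args := buildWhere("ws-1", IssueFilter{
		Priorities:        []string{"high"},
		IncludeNoAssignee: true,
	})
	fmt.Println(where, len(args))
}
```

Because every active filter lands in the clause before LIMIT/OFFSET is
applied, a match can no longer be hidden behind the first page.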
* feat(issues): drive workspace + my-issues filters from the server
issueListOptions and myIssueListOptions now key the React Query cache
on a normalized filter object, so each filter combination has its own
cache entry and a filter change re-fetches with the wire-shape filter
applied server-side. Drops the client-side filterIssues step on the
issues page, my-issues page, and project detail — that step silently
hid matches that lived past the first paginated page (#1491).
Adds a Label submenu to the workspace issues filter dropdown, plus
labelFilters in the view store. Mutations and ws-updaters fan their
optimistic patches across every filter-keyed list cache via
qc.setQueriesData on issueKeys.listPrefix(wsId), and the editor's
mention-suggestion reads from any matching list cache for instant
first paint regardless of which filter is active.
* fix(issues): route Members/Agents scope through server-side filter
The Members/Agents scope tabs on the workspace issues page were still
narrowing client-side via `assignee_type === 'member'`. That hits the
exact pagination-blind bug this PR is meant to fix: if the first 50
issues per status don't include the right assignee type, the tab
shows "No issues" while later pages have matches.
Adds an `assignee_types text[]` filter to ListIssues / ListOpenIssues /
CountIssues, threads it through the API client, normalizer and view
filter, and maps the scope tab to it. Each scope now keys its own
list cache and refetches with the correct first page.
Also disables the My Issues "My Agents" query when the user owns no
agents — `assignee_ids: []` was getting dropped by both the API client
and the query-key normalizer, so the request went out unfiltered and
surfaced unrelated issues under "My Agents".
Follow-ups to #1765 review nits:
- Tighten the per-turn prompt and AGENTS.md workflow instructions so
that "exit with no output" only applies when the trigger is from
another agent AND no actual work was produced this turn. If the
agent did real work, the standard "post results as a comment" rule
still applies — a result reply is not a noise comment.
- Add TestAgentExplicitMentionStillTriggers as a positive control
documenting the boundary the structural fix preserves: suppressing
implicit parent-mention inheritance for agent authors does NOT
block deliberate handoffs. An agent that explicitly @mentions
another agent in its own content still enqueues a task for the
mentioned agent and does not self-trigger.
When an agent replied in a thread whose root mentioned another agent,
the reply inherited the parent mention and re-triggered the other agent.
This caused 'No reply needed' ping-pong loops between co-assigned agents.
Structural fix:
- In enqueueMentionedAgentTasks, suppress parent-mention inheritance
when authorType == 'agent'. Explicit @mentions in the agent's own
comment still work for deliberate handoffs.
Defense-in-depth (prompt):
- Strengthen per-turn prompt and AGENTS.md workflow instructions to
explicitly forbid posting 'No reply needed' noise comments.
Regression test:
- TestAgentReplyDoesNotInheritParentMentions covers both the fix
(agent reply does not re-trigger) and the positive control
(member reply still inherits mentions).
Also updates TestBuildPromptCommentTriggeredByAgent to match the
new prompt wording.
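The inheritance rule can be sketched as a pure function. mentionTargets
and its parameters are hypothetical; the real
enqueueMentionedAgentTasks works on comments and database rows:

```go
package main

import "fmt"

// mentionTargets returns the agent IDs that should get a task for a new
// comment. explicit holds mentions in the comment's own content, parent
// holds mentions inherited from the thread root.
func mentionTargets(authorType, authorID string, explicit, parent []string) []string {
	candidates := append([]string{}, explicit...)
	if authorType != "agent" {
		// Only non-agent authors inherit the thread root's mentions.
		candidates = append(candidates, parent...)
	}

	seen := map[string]bool{authorID: true} // never enqueue the author itself
	targets := []string{}
	for _, id := range candidates {
		if !seen[id] {
			seen[id] = true
			targets = append(targets, id)
		}
	}
	return targets
}

func main() {
	// Agent reply under a root that mentions agent-b: nothing is inherited.
	fmt.Println(mentionTargets("agent", "agent-a", nil, []string{"agent-b"}))
	// Member reply in the same thread: the parent mention still applies.
	fmt.Println(mentionTargets("member", "user-1", nil, []string{"agent-b"}))
}
```

An explicit mention in the agent's own comment passes through the
explicit slice, so deliberate handoffs keep working.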
* refactor(server): make ParseUUID error-returning to prevent silent data loss (MUL-1410)
util.ParseUUID previously swallowed errors and returned a zero pgtype.UUID
on invalid input. When this zero UUID reached a write query (DELETE/UPDATE),
the SQL matched zero rows and the handler returned 2xx success — producing
silent data corruption. #1661 (DeleteIssue with identifier-style ID) was the
visible symptom; PR #1680 patched that one site, this commit closes the
class of bug.
Changes:
- util.ParseUUID now returns (pgtype.UUID, error). Add util.MustParseUUID
for trusted round-trips that should panic on invalid input.
- handler/handler.go: parseUUID wrapper now calls MustParseUUID — any
unguarded user-input string reaching it surfaces as a recovered panic
(chi middleware.Recoverer → 500) instead of silently corrupting data.
Add parseUUIDOrBadRequest(w, s, fieldName) for handler entry points.
- Convert every Queries.Delete*/Update* call site reachable from raw user
input (autopilot, comment, project, skill, skill_file, label, pin,
attachment, feedback, issue assignee, daemon runtime, workspace) to
validate UUIDs explicitly with parseUUIDOrBadRequest, returning 400 on
invalid input. Where a resolved entity.ID is already in scope, write
queries now use it directly instead of re-parsing the URL string.
- Update getWorkspaceMember + loadIssueForUser to handle invalid UUIDs
gracefully (404/400 instead of panic).
- Update util/middleware/cmd-level callers (subscriber_listeners,
notification_listeners, activity_listeners, scope_authorizer,
middleware/workspace) to use the error-returning API.
- Add server/internal/util/pgx_test.go covering valid/invalid input and
the MustParseUUID panic contract.
- Add TestDeleteIssueByIdentifier + TestDeleteIssueRejectsInvalidUUID
regression tests in handler_test.go (the original #1661 bug + the
invalid-input case).
- Document the handler UUID parsing convention in CLAUDE.md so the rule
is enforceable in future PR review.
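A standalone sketch of the error-returning contract. The UUID struct is
a stand-in for pgtype.UUID so the example needs only the standard
library, and the parsing logic is an assumption rather than the code in
util:

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// UUID is a stand-in for pgtype.UUID (assumption for this sketch).
type UUID struct {
	Bytes [16]byte
	Valid bool
}

var ErrInvalidUUID = errors.New("invalid UUID")

// ParseUUID returns an error instead of a silent zero value.
func ParseUUID(s string) (UUID, error) {
	if len(s) != 36 || s[8] != '-' || s[13] != '-' || s[18] != '-' || s[23] != '-' {
		return UUID{}, fmt.Errorf("%w: %q", ErrInvalidUUID, s)
	}
	raw, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
	if err != nil || len(raw) != 16 {
		return UUID{}, fmt.Errorf("%w: %q", ErrInvalidUUID, s)
	}
	var u UUID
	copy(u.Bytes[:], raw)
	u.Valid = true
	return u, nil
}

// MustParseUUID is for trusted round-trips only; it panics on bad input.
func MustParseUUID(s string) UUID {
	u, err := ParseUUID(s)
	if err != nil {
		panic(err)
	}
	return u
}

func main() {
	// An identifier-style ID is rejected up front instead of becoming a
	// zero UUID that matches no rows.
	if _, err := ParseUUID("MUL-7"); err != nil {
		fmt.Println("400 Bad Request:", err)
	}
	u := MustParseUUID("123e4567-e89b-12d3-a456-426614174000")
	fmt.Println(u.Valid)
}
```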
* fix(server): address GPT-Boy review of #1748
P1 fixes from PR #1748 review:
1. Migrate remaining request-boundary UUIDs to parseUUIDOrBadRequest so
malformed input returns 400 instead of panic/500. Previously missing on:
- issue.go: workspace_id in CreateIssue/ChildIssueProgress/ListIssues/
SearchIssues/BatchUpdateIssues/BatchDeleteIssues; project_id /
parent_issue_id / lead_id / assignee_id / assignee_ids / creator_id
filters; batch issue_ids and assignee/parent/project fields in
BatchUpdateIssues (skip on bad input via util.ParseUUID, matching
the existing per-row continue semantics).
- project.go: project id + workspace_id in GetProject/UpdateProject/
DeleteProject; lead_id in CreateProject/UpdateProject;
workspace_id in ListProjects + SearchProjects.
- handler.go: resolveActor now uses util.ParseUUID for X-Agent-ID /
X-Task-ID headers; invalid UUID falls back to "member" (matches
pre-existing semantics) instead of panicking.
- issue.go: validateAssigneePair returns 400 on invalid workspace_id
instead of panicking.
2. Fix issue:deleted WS event payloads to emit uuidToString(issue.ID)
instead of the raw URL string. After an identifier-path delete
("MUL-7"), the previous payload would have leaked the identifier to
subscribers, leaving stale entries in frontend caches that key by
UUID. Updated DeleteIssue (issue.go:1341) and BatchDeleteIssues
(issue.go:1641). The slog "issue deleted" log line also now records
the resolved UUID so logs match the WS payload.
3. Extend TestDeleteIssueByIdentifier to subscribe to the bus and
assert issue:deleted.payload.issue_id is the resolved UUID, not
the identifier.
* fix(server): validate remaining reviewed UUID inputs
* fix(server): validate remaining handler UUID inputs
* fix(server): finish request boundary UUID audit
* fix(server): validate remaining request body UUIDs
* fix(server): validate runtime path UUIDs
* fix(server): validate remaining audit UUID inputs
---------
Co-authored-by: Eve <eve@multica.ai>
When a user deletes a comment that triggered an agent task, the agent
would still run with the now-deleted content baked into its prompt
(fetched at task claim time) — manifesting as "the agent still sees the
deleted comment". The FK ON DELETE SET NULL only nullified
trigger_comment_id; the queued task itself was never cancelled.
DeleteComment now cancels any queued/dispatched/running task whose
trigger is the deleted comment, before the comment row is removed.
* fix(comments): preserve newlines from agent CLI writes
Agents (e.g. Codex) routinely emit `multica issue comment add --content
"para1\n\npara2"` because Python/JSON-style string literals are their
default. Bash does not expand `\n` inside double quotes, so the literal
4-char sequence flowed through the CLI into the database and rendered
as text in the issue panel — comments came out as one wall of prose.
Three coordinated fixes so the platform behavior no longer depends on
whether a given model has strong bash-quoting intuition:
- CLI: decode `\n / \r / \t / \\` in `--content` and `--description` for
`issue create / update / comment add` (callers needing a literal
backslash still have `--content-stdin`).
- Agent prompt: rewrite the comment-add example in the injected runtime
config to require `--content-stdin` + HEREDOC for any multi-line body,
and call out the same rule for `--description`. The previous wording
flagged stdin only for "backticks, quotes", which models read as
irrelevant to plain paragraphs.
- Renderer: add `remark-breaks` to the shared Markdown plugin chain so a
bare `\n` becomes a visible line break instead of a CommonMark soft
break — protects against models that emit single newlines for
formatting.
Tests: pin the new CLI helper, and pin the runtime-config guidance so
the multi-line wording cannot decay back into a footnote.
* fix(comments): address review feedback on newline-rendering PR
- Cover the issue panel: ReadonlyContent (used by every comment card and
the issue description) has its own react-markdown wiring; add
remark-breaks there too so the renderer fix actually applies to the
surface the bug was reported on, not just the chat panel. Pinned by
ReadonlyContent line-break tests.
- Make the prompt's `--description` guidance executable: add
`--description-stdin` to `issue create` / `issue update`, refactor
comment-add to share a single `resolveTextFlag` helper, and have the
injected runtime config name the real flag instead of an imaginary
"stdin / a tempfile" path. Pinned by the runtime-config guidance test.
- Document the unescape contract on each affected flag's help text and
pin the precise boundary in tests: `\n / \r / \t / \\` are decoded;
`\d / \w / \s / \u / \0` and other unrecognised escapes pass through
verbatim, so regex literals and Windows paths survive intact unless
they embed a literal `\n` / `\r` / `\t`. Callers that need the literal
sequence have `--content-stdin` / `--description-stdin` as the escape
hatch.
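The unescape boundary described above can be sketched as follows. The
function name unescapeCLIText is hypothetical; the real helper may be
structured differently:

```go
package main

import (
	"fmt"
	"strings"
)

// unescapeCLIText decodes only \n, \r, \t and \\. Every other
// backslash sequence is copied through unchanged.
func unescapeCLIText(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] == '\\' && i+1 < len(s) {
			decoded := byte(0)
			switch s[i+1] {
			case 'n':
				decoded = '\n'
			case 'r':
				decoded = '\r'
			case 't':
				decoded = '\t'
			case '\\':
				decoded = '\\'
			}
			if decoded != 0 {
				b.WriteByte(decoded)
				i++ // skip the escaped character
				continue
			}
		}
		b.WriteByte(s[i])
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", unescapeCLIText(`para1\n\npara2`))
	fmt.Printf("%q\n", unescapeCLIText(`^\d+\s\w$`))
}
```

Working byte by byte is safe for UTF-8 input here because the backslash
is ASCII and all other bytes are copied verbatim.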
* feat(issues): render labels on list/board with bulk server-side fetch
ListIssues / ListOpenIssues / GetIssue now bulk-fetch labels per response
via a new ListLabelsForIssues query so the client gets labels in a single
round-trip instead of N requests per visible issue. List-row and board-card
read issue.labels directly; an issue_labels:changed WS handler patches the
list and detail caches in place so chips stay live across tabs, and
attach/detach mutations mirror their result into the same caches for
immediate same-tab feedback.
Adds a "Labels" toggle to the card properties dropdown (defaults on).
* fix(issues): preserve cached labels and refresh on label edit/delete
Three fixes from GPT-Boy's review of #1741:
1. IssueResponse.Labels was a non-omitempty slice, so paths that didn't
load labels (UpdateIssue, batch updates, the issue:updated WS broadcast)
serialized labels:null. onIssueUpdated then merged that null into the
list/detail caches, wiping chips on every other tab whenever any non-
label field changed. Switched to *[]LabelResponse + omitempty: nil =
field absent (client merge keeps existing labels); non-nil (incl. empty
slice) = authoritative.
2. issue.labels is a denormalized snapshot, but useUpdateLabel /
useDeleteLabel and the WS label:* prefix only touched labelKeys, leaving
stale chips in list/board after rename/recolor/delete. Mutations now
also invalidate issueKeys.all(wsId), and the realtime refreshMap maps
the label prefix to both labels and issues invalidation for cross-tab.
3. Persisted cardProperties from before this branch lacks the new `labels`
key. Render fell back to `?? true` but the dropdown switch read it raw
and showed unchecked. Added a custom Zustand merge that deep-merges
cardProperties so newly added toggles inherit defaults for existing
users; dropped the `?? true` fallbacks now that the store guarantees
the key.
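The nil-versus-empty distinction in fix 1 is standard encoding/json
behaviour and can be checked in isolation. The struct fields below are
trimmed-down assumptions, not the real response types:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LabelResponse and IssueResponse are reduced to the fields needed to
// show the serialization rule.
type LabelResponse struct {
	Name string `json:"name"`
}

type IssueResponse struct {
	ID string `json:"id"`
	// nil pointer: field omitted, client keeps its cached labels.
	// non-nil pointer (even to an empty slice): authoritative value.
	Labels *[]LabelResponse `json:"labels,omitempty"`
}

func encode(r IssueResponse) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(IssueResponse{ID: "i1"}))

	none := []LabelResponse{}
	fmt.Println(encode(IssueResponse{ID: "i1", Labels: &none}))

	one := []LabelResponse{{Name: "bug"}}
	fmt.Println(encode(IssueResponse{ID: "i1", Labels: &one}))
}
```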
* feat(labels): add issue label CRUD + attach/detach handlers (#1191)
The issue_label and issue_to_label tables were scaffolded in 001_init.up.sql
but never wired to any code path. This commit ships the backend for #1191:
- Migration 048: adds created_at/updated_at timestamps + workspace-scoped
case-insensitive unique index on label names
- sqlc queries for label CRUD + issue<->label attach/detach + batch list
(ListLabelsByIssueIDs for board/list views)
- HTTP handlers: /api/labels CRUD, /api/issues/{id}/labels attach/detach
- Protocol events: label:{created,updated,deleted} + issue_labels:changed
- Handler tests covering CRUD, duplicate-name conflict, invalid-color,
attach/detach idempotency, and cross-workspace isolation
* feat(cli): add label and issue label subcommands (#1191)
- multica label {list,get,create,update,delete}
- multica issue label {list,add,remove}
Both follow existing CLI conventions (JSON/table output, flag shapes)
and exercise the /api/labels endpoints shipped in the previous commit.
* feat(web): add labels UI — picker with inline create + management dialog (#1191)
Exposes the backend label feature to users via the existing issue-detail
sidebar.
- `@multica/core/types/label` — Label, CreateLabelRequest, UpdateLabelRequest,
plus response envelopes
- `@multica/core/api/client` — 8 methods for label CRUD and issue↔label
attach/detach
- `@multica/core/labels` — labelKeys, queryOptions, and mutation hooks with
optimistic updates (matches the project/ module layout)
- WS event type literals extended for label:{created,updated,deleted} and
issue_labels:changed
- `views/labels/label-chip.tsx` — colored pill; uses relative luminance
(ITU-R BT.601) to pick #111827 or #f9fafb text so chips stay readable on
both pastel and saturated backgrounds
- `views/issues/components/pickers/label-picker.tsx`
- Multi-select combobox in the issue sidebar
- When 0 labels: "Add label" trigger
- When 1+ labels: the chips themselves are the trigger; × on each chip
detaches without opening the picker
- Inline create: typing a new name + Enter creates with a hash-derived
color and attaches in one motion (matches Linear/GitHub)
- "Manage labels…" footer opens a dialog containing the full workspace
panel — users never leave the issue context to rename/recolor/delete
- `views/issues/components/labels-panel.tsx` — workspace labels manager.
Single-row create form (color swatch + name + Add button). Each label
row supports inline rename + recolor + delete (with confirm dialog).
Color input uses the browser's native picker for full-gamut access —
no preset palette clutter.
- `PropRow label="Labels"` added to the issue-detail sidebar below Project
Labels are issue metadata everyone uses — not admin configuration.
Putting them in Settings next to destructive workspace actions misframed
them; adding a top-level nav entry or a sibling tab to the Issues page
added surface area that wasn't earning its keep for a feature users
touch occasionally. Keeping management in a dialog launched from the
picker itself keeps users in their issue context and matches how GitHub
handles label editing from the label selector.
Merge the two symlink removal branches in exposeSharedCodexPluginCache —
they shared the same os.Remove + recreate path with only the error label
differing. The branch is now keyed off Lstat's ModeSymlink bit, with
Readlink reused only to fast-path an already-correct link. Behaviour is
unchanged; just less duplicated code.
CancelTasksForIssue silently dropped the list of affected tasks, so
whenever an issue transitioned to "cancelled" or "done" while a task was
still active (6 call sites in issue.go), the underlying agent was left
stuck at status="working" indefinitely and needed a manual
`multica agent update <id> --status idle` to recover. This matches
the symptom reported in #1587: task rows move to "cancelled" via a
non-user-initiated path, agent status never reconciles.
Change CancelAgentTasksByIssue from :exec to :many (also tack on
completed_at = now() for consistency with CancelAgentTasksByIssueAndAgent),
then update CancelTasksForIssue to iterate the returned rows and call
ReconcileAgentStatus + broadcast task:cancelled per affected task —
mirroring the pattern already used by CancelTask and RerunIssue.
No test added; the change is small and mirrors well-covered paths.
Happy to add a mock-backed test in a follow-up if reviewers prefer.
Refs #1587
Refs #1149
Expose the shared Codex plugin cache inside each per-task CODEX_HOME before launch so plugin-provided skills are available on the first session.
Refresh agent-assigned workspace skills for both newly prepared and reused Codex environments, and cover plugin cache plus reuse behavior with focused execenv tests.
* fix(server): validate assignee_id existence on issue create/update
POST /api/issues and PUT /api/issues/:id silently accepted any
well-formed UUID as assignee_id (#1662). The new validateAssigneePair
helper consolidates the existing canAssignAgent check and adds:
- existence lookup against workspace members for assignee_type=member
- existence lookup against workspace agents for assignee_type=agent
- pair consistency: type and id must be both set or both null
- whitelist for assignee_type values (member|agent)
UpdateIssue and BatchUpdateIssues now run the same validator on the
post-merge assignee pair whenever the caller touches either field,
closing the parallel gap on the update path.
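A rough sketch of the checks listed above. The Lookup type stands in
for the member/agent queries, empty strings stand in for null, and the
canAssignAgent permission check is left out; all of these are
simplifications for illustration:

```go
package main

import (
	"errors"
	"fmt"
)

// Lookup reports whether id exists in the workspace for the given kind.
// It is a hypothetical stand-in for the database queries.
type Lookup func(kind, id string) bool

func validateAssigneePair(assigneeType, assigneeID string, exists Lookup) error {
	switch {
	case assigneeType == "" && assigneeID == "":
		return nil // unassigned is valid
	case assigneeType == "" || assigneeID == "":
		return errors.New("assignee_type and assignee_id must be set together")
	case assigneeType != "member" && assigneeType != "agent":
		return fmt.Errorf("unsupported assignee_type %q", assigneeType)
	case !exists(assigneeType, assigneeID):
		return fmt.Errorf("%s %s not found in workspace", assigneeType, assigneeID)
	}
	return nil
}

func main() {
	exists := func(kind, id string) bool { return kind == "member" && id == "user-1" }
	fmt.Println(validateAssigneePair("member", "user-1", exists))
	fmt.Println(validateAssigneePair("agent", "ghost", exists))
}
```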
* fix(server): reject malformed assignee_id at handler entry
parseUUID silently returns an invalid pgtype.UUID for unparseable input
and validateAssigneePair treats (type unset + id invalid) as "no
assignee". Together they let `POST /api/issues` and `PUT /api/issues/:id`
silently drop a malformed assignee_id and return a successful response.
Reject the parse failure inline at every entry point — Create, Update,
and BatchUpdateIssues — so the validator never sees an unparseable id.
Adds two regression tests covering the create and update paths.
Follow-up to #1686. Locks in two nits flagged during review:
1. agent.Result.Status doc comment now lists "cancelled" alongside the
existing values, so the enum surface matches actual usage.
2. New TestExecuteAndDrain_ContextCancelled_ReportsCancelled exercises
the path added in #1686: when the parent context is cancelled before
the backend produces a Result, executeAndDrain must return
Status="cancelled" (not "timeout"). A regression here would silently
restore the misleading log line we just fixed.
DeleteIssue passed the raw URL parameter through parseUUID(), which
returns a zero UUID for human-readable identifiers like "API-123".
This caused DELETE requests with identifier-style IDs to silently
succeed (204) without actually deleting the issue.
Use issue.ID from the already-resolved issue object instead, consistent
with BatchDeleteIssues and all other operations in the same handler.
Fixes #1661
When the server cancels a task (e.g. assignee changes during execution,
explicit user cancel, or a failed workspace_isolation check), the daemon's
cancellation poll fires runCancel() on the run context. The drainCtx
derived from runCtx then signals Done(), but executeAndDrain() was
returning Status: "timeout" regardless of *why* the context ended.
The "agent finished status=timeout" log line is then misleading — it
suggests an actual deadline timeout when really the task was cancelled
by upstream. We spent hours misdiagnosing a healthy handoff as a
broken timeout because of this.
Distinguish context.Canceled from context.DeadlineExceeded in
executeAndDrain, and add a "cancelled" case to runTask so the status
propagates through the existing log path.
No behaviour change for genuine timeouts; no behaviour change for
the cancelled-by-poll discard path in handleTask. Only the daemon
log line and TaskResult.Status get the more accurate label.
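The distinction relies on the standard context error values. A minimal
sketch, where endStatus and demo are hypothetical names and the real
executeAndDrain does considerably more:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// endStatus maps the reason a context ended to a task status label.
func endStatus(ctx context.Context) string {
	err := ctx.Err()
	switch {
	case errors.Is(err, context.Canceled):
		return "cancelled"
	case errors.Is(err, context.DeadlineExceeded):
		return "timeout"
	default:
		return "" // context has not ended
	}
}

// demo ends one context by cancellation and one by deadline.
func demo() (afterCancel, afterDeadline string) {
	runCtx, runCancel := context.WithCancel(context.Background())
	runCancel() // what the daemon's cancellation poll triggers
	afterCancel = endStatus(runCtx)

	// A deadline in the past expires the context immediately.
	expired, cancel := context.WithTimeout(context.Background(), -time.Second)
	defer cancel()
	afterDeadline = endStatus(expired)
	return afterCancel, afterDeadline
}

func main() {
	fmt.Println(demo())
}
```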
PR #1632 updated the Pi project-level skill dir from
.pi/agent/skills/ to .pi/skills/, but missed two references:
- server/internal/daemon/execenv/runtime_config.go:20 — the comment
block here lists project-level paths for every other provider, so
using Pi's global path was inconsistent and misleading.
- docs/docs-rewrite-plan.md:88 — planning doc still listed the old
path in the Skills row.
Follow-up to #1632.
Mitigates #1637 and the related model-discovery failure in MUL-1397 by bounding the /api/daemon/heartbeat hot path with an ack-safe probe/claim split, adding structured slow-log attribution, and closing the ModelListStore running-state gap. See PR description for details.
Closes the functional gap the reporter hit on alchaincyf/huashu-design
(skills.sh/alchaincyf/huashu-design/huashu-design) without expanding
candidatePaths unconditionally, which would let an unrelated root
SKILL.md hijack a different skill URL in a multi-skill repo.
Try SKILL.md at the repo root before falling into the recursive tree
fallback added in #1432. Verify the frontmatter name matches the
requested skill so only genuine single-skill repos take the fast path.
For those repos this also shaves the recursive tree API call.
Also clarifies the candidate-path comment so the root case is
explicit.
Drop priority and project_id from autopilot. project_id was never exposed
in the UI and priority duplicated the agent's own task queue priority.
Redesign the create/edit modal as a Runbook (left) + Configuration (right)
layout. Rework the Schedule section around a single visual shell so every
picker aligns pixel-for-pixel on the same row:
- TimeInput (new): segmented HH:MM control adapted from openstatusHQ/time-picker,
driven by keyboard (ArrowUp/Down to step, ArrowLeft/Right to jump segment,
digit typing with a 2s two-digit window). Replaces <input type="time">,
whose native UI broke the design system. Supports a minuteOnly variant
for hourly schedules.
- TimezonePicker (new): searchable Popover with a fixed-width left check
slot so rows stay aligned and GMT offsets never collide with the selected
indicator.
- Runbook editor now lives in a bordered card, giving the placeholder an
input surface instead of bare document flow.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(daemon): harden agent mention-loop instructions
Two agents that mention each other via `mention://agent/<id>` can fall into
an infinite reply loop — each says "I'm done" in prose but keeps
`@mentioning` the other, which re-enqueues their run. Adding hard caps on
agent-to-agent turns conflicts with Multica's design principle of giving
agents the same authorship freedom as humans, so this change hardens the
instructions that the harness injects instead.
- Replace the terse "mentions are actions" blurb with a full Mentions
protocol: `side-effecting` warning, explicit "when NOT to mention"
(replying to another agent, sign-offs, thanks) and "when a mention IS
appropriate" (human escalation, first-time delegation, user asked).
- Add a pre-workflow decision step for comment-triggered runs: decide
whether a reply is warranted at all, decide whether to include any
`@mention`, and clarify that the post-a-comment rule is mandatory *if*
you reply — silence is a valid exit for agent-to-agent threads.
- Thread the triggering comment's author kind + display name
(`TriggerAuthorType` / `TriggerAuthorName`) from the claim endpoint
through the daemon task type, per-turn prompt, and CLAUDE.md workflow.
When the author is another agent, both surfaces now name that agent
and warn against sign-off mentions.
- Soften the old closing line that told agents to `always` use the
mention format — the word generalized to member/agent mentions and
encouraged the very behavior that causes loops.
Refs GH#1576, MUL-1323.
* fix(daemon): remove MUST-respond conflict and sanitize trigger author name
Addresses two blocking points on PR #1581:
1. buildCommentPrompt told the agent "You MUST respond to THIS comment"
and unconditionally appended the reply command — directly conflicting
with the new agent-to-agent silence-as-valid-exit workflow. Models
were likely to keep following the older must-reply rule and fall back
into the loop this PR is trying to close.
Rewrite the header as "Focus on THIS comment — do not confuse it
with previous ones" (keeps the anti-stale-comment signal) and change
BuildCommentReplyInstructions to open with "If you decide to reply,
post it by running exactly this command" so the reply command is
available but conditional across both prompt surfaces.
2. Raw agent/user display names were being embedded directly into the
high-priority prompt and CLAUDE.md via TriggerAuthorName. Agent and
member names are only validated as non-empty at write time, so a
name containing newlines, backticks, or fake mention markup would
turn the field into a cross-agent prompt-injection surface.
Add execenv.SanitizePromptField — strip control runes, collapse
whitespace, drop markdown structural characters (backtick, asterisk,
brackets, pipe, angle brackets, hash, backslash), truncate to 64
runes — and apply it at both embed sites (per-turn prompt and
CLAUDE.md). Defense-in-depth at the consumption layer so this works
for already-stored names without a migration.
Tests: TestSanitizePromptField covers the policy; TestBuildPromptSanitizesAgentName
plants an attack payload in TriggerAuthorName and checks the rendered prompt
does not leak the newline-anchored injection or the fake mention markup.
TestBuildPromptCommentTriggered*{,ByMember} updated to lock in the
conditional reply-command framing.
* refactor(daemon): trim redundant CLAUDE.md preamble and drop name sanitizer
Per PR #1581 feedback:
1. Remove the `if ctx.TriggerAuthorType == "agent"` preamble block in
runtime_config.go. It duplicated what workflow steps 4 and 5 already
say ("Decide whether a reply is warranted", "Never @mention the
agent you are replying to as a thank-you or sign-off"), so the
signal lands the same without the extra ~7 lines of CLAUDE.md. The
per-turn prompt preamble in prompt.go stays — that surface has no
numbered workflow below it and would otherwise lose the
silence-as-exit signal.
2. Delete execenv.SanitizePromptField + its test. Workspace agents are
created by trusted team members, so the cross-agent name-injection
surface it defended isn't realistic in the current trust model.
3. Drop TriggerAuthorType/Name from execenv.TaskContextForEnv and stop
populating them in daemon.go — they're no longer read by the
execenv package. The same fields on daemon.Task stay because
prompt.go still needs them to label the triggering author in the
per-turn prompt.
Tests simplified to match the leaner shape: CLAUDE.md regression
guards now assert that the anti-loop phrases live in the numbered
workflow, and the sanitizer-specific tests are removed.
* feat(daemon/gc): tighten GC defaults + flex duration suffix
Driven by user feedback in #1539 (40 GB VPS filling within 24h of heavy
AI-coding usage): the existing TTLs were sized for desktop/laptop
deployments and are too lenient for small-disk, long-running daemons.
- GCTTL: 5d → 24h. Done/cancelled issues almost never need a multi-day
grace period in AI-coding workflows.
- GCOrphanTTL: 30d → 72h. Covers crash-leftover and pre-GC directories
without a month-long wait.
- Issue-deleted orphans (API returns 404) are now cleaned on the next GC
cycle regardless of mtime. The issue row is gone; there is nothing
left to protect.
- parseFlexDuration: accept a `d` (day) suffix in addition to the stdlib
time.ParseDuration syntax. MULTICA_GC_TTL=5d now works; previously only
120h was accepted.
* fix(daemon/gc): address review — 404 safety + decimal/overflow in duration parser
Two issues flagged in PR review:
1. 404-immediate-clean is unsafe. The /gc-check endpoint returns 404 for
both "issue deleted" AND "daemon token has no access to the workspace"
(anti-enumeration, see requireDaemonWorkspaceAccess). Clean-on-404
would let a scoped-down daemon token wipe taskDirs whose issues are
still live. Restore the mtime gate against GCOrphanTTL. With the new
72h default we still shrink the original 30d window dramatically
without the cross-workspace hazard. Lock the behavior in with a new
test that asserts a recent 404 is skipped.
2. parseFlexDuration mishandled decimals and swallowed Atoi errors:
"0.5d" → 7m12s (regex matched only the "5d"), "1.5d" → 1h7m12s,
and 20+ digit day values Atoi-errored silently to 0. Match the full
decimal number with `\d*\.\d+|\d+` and parse with ParseFloat so
fractional days and oversized inputs both go through
time.ParseDuration correctly — fractions as sub-hour durations,
overflow as a returned error.
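One way to implement the parser described in point 2, using the regex
quoted above. This is a sketch under the assumption that day values are
rewritten to hours before delegating to time.ParseDuration; the actual
function may differ in detail:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
	"time"
)

// dayRe matches a whole decimal number followed by the "d" suffix.
var dayRe = regexp.MustCompile(`(\d*\.\d+|\d+)d`)

// parseFlexDuration accepts time.ParseDuration syntax plus a "d" suffix.
func parseFlexDuration(s string) (time.Duration, error) {
	var convErr error
	expanded := dayRe.ReplaceAllStringFunc(s, func(m string) string {
		days, err := strconv.ParseFloat(strings.TrimSuffix(m, "d"), 64)
		if err != nil {
			convErr = err
			return m
		}
		// Rewrite days as hours; ParseDuration handles fractions and
		// rejects values that overflow.
		return strconv.FormatFloat(days*24, 'f', -1, 64) + "h"
	})
	if convErr != nil {
		return 0, convErr
	}
	return time.ParseDuration(expanded)
}

func main() {
	for _, in := range []string{"5d", "0.5d", "1.5d", "90m"} {
		d, err := parseFlexDuration(in)
		fmt.Println(in, d, err)
	}
}
```

Matching the full number first is what prevents "0.5d" from being read
as "0." followed by "5d".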
Review follow-up on PR #1557: the server-side change started returning
500 when the store write failed, but the daemon's handleLocalSkillList /
handleLocalSkillImport were discarding the ReportLocalSkill*Result error
return. Net effect was a silent drop — the daemon moved on, the request
stayed in "running" on the server, and the user saw the same "daemon did
not respond within 30 seconds" timeout the store refactor was supposed
to kill.
Fix: route both report calls through reportLocalSkillResultWithRetry,
which retries on 5xx + network errors with 0 / 0.5s / 2s / 4s backoff
(total ~6.5s, well inside the 60s server-side running timeout), stops
on 4xx (request expired / cross-workspace rejection — retry won't help),
bails on context cancel, and logs Error on exhaustion so ops has a
footprint to grep for.
Tests (server/internal/daemon/local_skill_report_test.go, 6 new cases):
- 500 twice then success -> 3 attempts, second retry lands
- 404 -> exactly 1 attempt (permanent, no retry)
- import 502 then success -> 2 attempts
- All-500 -> burns through all backoff slots then gives up with ERROR log
- Context cancel mid-backoff -> exactly 1 attempt, cancellation logged
- Smoke: report paths hit /api/daemon/runtimes/<rt>/local-skills{,import}/<req>/result
localSkillReportBackoffs is var-assignable so tests can swap in zero-delay
schedules without paying real sleep latency.
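The retry policy can be sketched as below. reportWithRetry, reportFunc
and simulate are hypothetical names; only the backoff schedule and the
4xx/5xx split come from the description above:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// reportBackoffs is a package-level var so tests can swap the schedule.
var reportBackoffs = []time.Duration{0, 500 * time.Millisecond, 2 * time.Second, 4 * time.Second}

// reportFunc performs one report call and returns the HTTP status, or an
// error for network-level failures.
type reportFunc func(ctx context.Context) (status int, err error)

func reportWithRetry(ctx context.Context, report reportFunc) (int, error) {
	attempts := 0
	var lastErr error
	for _, wait := range reportBackoffs {
		if wait > 0 {
			timer := time.NewTimer(wait)
			select {
			case <-ctx.Done():
				timer.Stop()
				return attempts, ctx.Err() // bail on cancellation
			case <-timer.C:
			}
		}
		attempts++
		status, err := report(ctx)
		switch {
		case err != nil:
			lastErr = err // network error: retry
		case status >= 500:
			lastErr = fmt.Errorf("server error: HTTP %d", status)
		case status >= 400:
			return attempts, fmt.Errorf("permanent failure: HTTP %d", status)
		default:
			return attempts, nil
		}
	}
	return attempts, fmt.Errorf("gave up after %d attempts: %w", attempts, lastErr)
}

// simulate replays scripted statuses with a zero-delay schedule, the
// same trick the tests use to avoid real sleeps.
func simulate(statuses []int) (int, error) {
	saved := reportBackoffs
	reportBackoffs = make([]time.Duration, len(saved))
	defer func() { reportBackoffs = saved }()

	i := 0
	return reportWithRetry(context.Background(), func(context.Context) (int, error) {
		status := statuses[i%len(statuses)]
		i++
		return status, nil
	})
}

func main() {
	fmt.Println(simulate([]int{500, 500, 200}))
	fmt.Println(simulate([]int{404}))
}
```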
* fix(skills): shared-state runtime local-skill stores (MUL-1288)
Fixes the bug Bohan surfaced on MUL-1288: behind prod's multi-node API the
runtime-local-skill list/import flow would intermittently time out or 404.
Root cause: LocalSkillListStore and LocalSkillImportStore were per-process
sync.Mutex+map, so when the frontend POST, the daemon heartbeat and the
frontend GET landed on different API instances, each saw a different
pending set. Confirmed against production daemon logs — the failed
request_id never showed up in the daemon's "runtime local skills
requested" log, even though other requests around the same window worked.
Per Yushen's guidance (server must stay stateless; state lives in
storage), migrate both stores to Redis so every node agrees on the same
pending set.
What changed
- LocalSkillListStore / LocalSkillImportStore are now interfaces. Methods
take context.Context and return error.
- InMemoryLocalSkill{List,Import}Store — renamed from the existing types,
kept as the default for single-node dev and the in-process test suite.
- RedisLocalSkill{List,Import}Store — new. Keyed on
mul:local_skill:{list,import}:<id> (JSON record, TTL = retention), with
a per-runtime ZSET mul:local_skill:{list,import}:pending:<runtime_id>
(score = created_at UnixNano) providing cross-node ordering. PopPending
wins the claim via ZREM == 1, so concurrent pops from different nodes
never return the same request twice.
- NewRouter gets an optional *redis.Client; when non-nil it swaps in the
Redis-backed stores. main.go hoists the existing Redis client (already
used by the realtime relay) so both subsystems share one client.
- Handler fields flip to interface types; handler.New still constructs
in-memory stores by default.
- Daemon heartbeat's PopPending call sites thread r.Context() through so
Redis operations inherit request cancellation. Errors warn instead of
poisoning the heartbeat response.
Tests
- Existing in-memory tests updated for the new signatures (ctx + error).
- New runtime_local_skills_redis_store_test.go covers:
- Create/Get/Complete round trip preserves skills payload
- PopPending across two *store instances sharing one rdb (the exact
regression: node A creates, node B pops)
- N concurrent PopPending on one record => exactly one winner
- Pending-timeout threshold transitions the record and removes the zset
member so a later PopPending doesn't return a timed-out request
- Import store round-trips CreatorID (which is json:"-" on the public
struct — needs a Redis envelope so ReportLocalSkillImportResult can
still attribute the created Skill)
- Per-runtime isolation — a PopPending for runtime B does not disturb
A's pending zset
- Tests skip gracefully if REDIS_TEST_URL is unset; CI now spins up a
redis:7-alpine service and exports the URL so the suite actually runs
there.
Out of scope
PingStore / UpdateStore / ModelListStore have the same shape and the
same latent bug (they just fire rarely enough to have gone unnoticed).
Migrating them to Redis is a follow-up — MUL-1288 is specifically the
local-skills break Bohan is blocked on.
* fix(skills): atomic Redis claim + surface store write failures (PR #1557 review)
Two real gaps GPT-Boy flagged:
1. RedisLocalSkill{List,Import}Store.PopPending was doing ZREM then SET as
two separate round-trips. If the SET failed for any reason — transient
Redis error, context cancellation, pod getting SIGKILL'd mid-call — the
request was already gone from the pending zset but the stored record
still said "pending", and no subsequent PopPending would re-dispatch
it. Exactly the "request disappears" class of bug this PR is supposed
to kill.
Fix: push the claim into a Lua script so Redis runs ZREM + SET as one
atomic unit. If ZREM returns 0 (another node won the race), SET is
skipped and the caller retries.
2. ReportLocalSkill{List,Import}Result handlers were logging Complete/Fail
store failures at Warn and still returning 200 OK. That made the
daemon think the report landed when it hadn't, leaving the request
stuck in "running" until the server-side timeout and — worse for the
import flow — leaving the just-created Skill row orphaned in Postgres
so every retry collided with the unique-name constraint.
Fix: escalate to Error + return 500 so the daemon (and monitoring) can
see the write failed. For the import flow, Complete failure after the
Skill row is already committed also triggers a best-effort DeleteSkill
so a daemon retry lands on a clean slate instead of hitting
"a skill with this name already exists" forever.
Tests
- New TestRedisLocalSkillListStore_PopPendingAtomicClaim asserts the
happy-path invariant: after one PopPending the record is "running"
AND a second PopPending returns nothing. Deliberately does NOT poke
Redis internals directly so the test survives any future key-layout
refactor.
- Existing cross-instance / concurrent / timeout / per-runtime tests
continue to pass against the Lua-based claim path (verified locally
against a scratch redis-server; 8/8 Redis tests green).
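To make the atomic-claim idea concrete, here is a hedged sketch. `claimScript` shows the kind of Lua that would run ZREM and SET as one unit; its key and argument layout is an assumption, not the script in the PR. `fakeRedis` is an in-process stand-in (no Redis client involved) that models the same all-or-nothing semantics so the "exactly one winner" property can be seen without a server.

```go
package main

import (
	"fmt"
	"sync"
)

// claimScript is illustrative Lua for an atomic claim. Assumed layout:
// KEYS[1] = pending zset, KEYS[2] = record key,
// ARGV[1] = request id, ARGV[2] = updated record JSON, ARGV[3] = TTL in ms.
const claimScript = `
if redis.call('ZREM', KEYS[1], ARGV[1]) == 1 then
  redis.call('SET', KEYS[2], ARGV[2], 'PX', ARGV[3])
  return 1
end
return 0
`

// fakeRedis models the script's semantics in-process: one lock stands in for
// Redis executing a script without interleaving other commands.
type fakeRedis struct {
	mu      sync.Mutex
	pending map[string]bool   // members of the pending zset
	records map[string]string // request id -> status
}

// claim mirrors claimScript: remove from pending and mark running together,
// or do nothing if another caller already removed the member.
func (f *fakeRedis) claim(id string) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if !f.pending[id] {
		return false
	}
	delete(f.pending, id)
	f.records[id] = "running"
	return true
}

// raceClaims starts n concurrent claims on one request and counts the winners.
func raceClaims(n int) int {
	f := &fakeRedis{
		pending: map[string]bool{"req-1": true},
		records: map[string]string{"req-1": "pending"},
	}
	var wg sync.WaitGroup
	var mu sync.Mutex
	winners := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if f.claim("req-1") {
				mu.Lock()
				winners++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return winners
}

func main() {
	fmt.Println(raceClaims(16))
}
```

The point of the single script is that there is no window in which the member has left the pending set while the record still reads "pending".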
* feat(runtimes): remove Test Connection / runtime ping feature
The Test Connection action invoked a real single-turn agent run to verify
runtime connectivity. In practice it was expensive, misleading (it reused
none of the normal task exec env), and low value: daemon heartbeat plus
Online status already cover the "is the runtime alive" question. This
change drops the whole end-to-end probe path:
- deletes server handler and in-memory PingStore
- drops pending_ping from the heartbeat response and daemon poll loop
- removes daemon.handlePing, PendingPing, ReportPingResult
- removes the CLI `multica runtime ping` command
- removes the PingSection UI block and RuntimePing types / api methods
* docs: fix runtime CLI subcommand list in product-overview
* feat(realtime): phase 0 — extract Broadcaster interface + add metrics
Phase 0 of the WebSocket horizontal-scaling plan tracked in MUL-1138.
This change is intentionally behavior-preserving: it sets up the seams
needed for later phases (subscribe/unsubscribe protocol, scope-level
fanout, Redis Streams relay) without altering any wire protocol or
producer call sites.
What changed
- New realtime.Broadcaster interface covering the three fanout methods
producers already use on *Hub (BroadcastToWorkspace, SendToUser,
Broadcast). *Hub continues to satisfy it; a future Redis-backed
implementation can be dropped in without touching listeners.
- registerListeners now depends on realtime.Broadcaster instead of
*realtime.Hub, isolating the bus → realtime fanout layer behind an
interface.
- New realtime.Metrics singleton with atomic counters: connects,
disconnects, active connections, slow-client evictions, total
messages sent/dropped, and per-event-type send counters. Wired into
Hub register/unregister/broadcast paths and into every listener.
- New GET /health/realtime endpoint returning a JSON snapshot of the
metrics so we can observe baseline fanout pressure before phase 1.
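A minimal sketch of the seam this commit describes. The three method names come from the list above; their parameter lists, the `countingHub` stand-in, the `publishIssueUpdated` helper and the snapshot keys are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Broadcaster covers the three fanout methods producers already call.
// Parameter lists are assumed; the commit names only the methods.
type Broadcaster interface {
	BroadcastToWorkspace(workspaceID string, msg []byte)
	SendToUser(userID string, msg []byte)
	Broadcast(msg []byte)
}

// Metrics holds lock-free counters (a subset of those listed above).
type Metrics struct {
	Connects     atomic.Int64
	Disconnects  atomic.Int64
	MessagesSent atomic.Int64
}

// Snapshot returns a point-in-time copy suitable for JSON encoding.
func (m *Metrics) Snapshot() map[string]int64 {
	c, d := m.Connects.Load(), m.Disconnects.Load()
	return map[string]int64{
		"connects":           c,
		"disconnects":        d,
		"active_connections": c - d,
		"messages_sent":      m.MessagesSent.Load(),
	}
}

// countingHub is a stand-in for *Hub that only records sends.
type countingHub struct{ metrics *Metrics }

func (h *countingHub) BroadcastToWorkspace(string, []byte) { h.metrics.MessagesSent.Add(1) }
func (h *countingHub) SendToUser(string, []byte)           { h.metrics.MessagesSent.Add(1) }
func (h *countingHub) Broadcast([]byte)                    { h.metrics.MessagesSent.Add(1) }

// publishIssueUpdated plays the role of a listener: it depends on the
// interface only, so a different Broadcaster can be dropped in later.
func publishIssueUpdated(b Broadcaster, workspaceID string) {
	b.BroadcastToWorkspace(workspaceID, []byte(`{"type":"issue:updated"}`))
}

func main() {
	m := &Metrics{}
	m.Connects.Add(2)
	m.Disconnects.Add(1)
	publishIssueUpdated(&countingHub{metrics: m}, "ws-1")
	snap := m.Snapshot()
	fmt.Println(snap["active_connections"], snap["messages_sent"])
}
```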
Why phase 0 first
GPT-Boy's only-Redis plan and CC-Girl's review both call out the same
prerequisite: get a Broadcaster seam and visibility in place before
introducing scope-level subscriptions or a Redis relay. Doing this as
a standalone step keeps each later PR focused and trivially revertible.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* feat(realtime): only-Redis fanout — scopes, subscribe protocol, Redis Streams relay (MUL-1138)
Implements the final-version plan agreed in MUL-1138 on top of phase 0:
* Hub: 4 scope types (workspace/user/task/chat), per-client subscription
set, subscribe/unsubscribe WS frames, ScopeAuthorizer hook for
task/chat scope auth, first/last-subscriber callbacks for the relay,
workspace+user auto-subscribe on connect.
* RedisRelay: Broadcaster impl that XADDs every event into
ws:scope:{type}:{id}:stream and XREADGROUPs only the scopes for which
this node has live subscribers. Per-node consumer group, heartbeat,
stale-consumer sweeper, MAXLEN cap, lag/disconnect metrics.
* Listeners: route task:* events to ScopeTask, chat:* events to
ScopeChat; workspace remains the default for everything else.
* events.Event: optional TaskID / ChatSessionID hints so the listener
layer can pick the right scope without re-parsing payloads.
* Handler: publishTask / publishChat helpers; chat + task message
publishers updated to use them.
* main.go: when REDIS_URL is set, wrap the hub with NewRedisRelay and
pass the relay (instead of the hub) to registerListeners. A
db-backed ScopeAuthorizer rejects task/chat subscribe requests whose
target is outside the caller's workspace.
* Metrics: per-scope subscribe/deny counters, redis connect state, node
id, lag/dropped counters surfaced via /health/realtime.
Behavior in single-node mode (REDIS_URL unset) is unchanged.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(realtime): address PR #1429 review must-fix items (MUL-1138)
- listeners: keep task/chat events on workspace fanout until the WS
client supports scope-subscribe + reconnect-replay. Routing them
through BroadcastToScope today (without any client subscriber) would
silently drop every chat / task message and break the live timeline,
chat unread badges, and pending-task UI. The server-side scope infra
(Hub subscribe/unsubscribe, ScopeAuthorizer, Redis Streams relay)
stays in place so flipping the switch in the client follow-up PR is
a one-line change.
- scope_authorizer: ScopeChat now enforces CreatorID == userID, mirroring
the HTTP layer (handler/chat.go: GetChatSession / SendChatMessage /
MarkChatSessionRead). Without this, any workspace member who learned a
session_id could subscribe to chat:message / chat:done /
chat:session_read for a peer's private chat. The same creator-only
check is applied to ScopeTask when the task is a chat task
(task.ChatSessionID set). Issue tasks remain workspace-scoped.
- Refactor scope authorizer to depend on a narrow scopeAuthQuerier
interface so its decisions can be unit-tested without a live DB.
- Add tests:
* listeners_scope_test.go pins the workspace-fanout fallback for
task:message / task:progress / chat:message / chat:done /
chat:session_read.
* scope_authorizer_test.go covers chat creator-only access, chat-task
creator-only access, and issue-task workspace-only access (creator
allowed, peer denied, cross-workspace denied, missing session
denied, empty userID denied).
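The creator-only chat check and the narrow querier interface can be sketched as follows. `chatSession`, `mapQuerier` and `authorizeChat` are hypothetical names with simplified string IDs; only `scopeAuthQuerier` and the rule itself (same workspace, CreatorID == userID) are taken from the notes above.

```go
package main

import (
	"errors"
	"fmt"
)

// chatSession is a simplified record; real IDs are likely UUIDs.
type chatSession struct {
	WorkspaceID string
	CreatorID   string
}

// scopeAuthQuerier is the narrow lookup the authorizer needs, so tests can
// supply a fake instead of a live database.
type scopeAuthQuerier interface {
	GetChatSession(id string) (chatSession, error)
}

var errNotFound = errors.New("chat session not found")

// mapQuerier is an in-memory fake for tests and this demo.
type mapQuerier map[string]chatSession

func (m mapQuerier) GetChatSession(id string) (chatSession, error) {
	s, ok := m[id]
	if !ok {
		return chatSession{}, errNotFound
	}
	return s, nil
}

// authorizeChat allows a chat-scope subscribe only when the session exists,
// belongs to the caller's workspace, and was created by the caller.
func authorizeChat(q scopeAuthQuerier, userID, workspaceID, sessionID string) bool {
	if userID == "" {
		return false
	}
	s, err := q.GetChatSession(sessionID)
	if err != nil {
		return false
	}
	return s.WorkspaceID == workspaceID && s.CreatorID == userID
}

func main() {
	q := mapQuerier{"chat-1": {WorkspaceID: "ws-1", CreatorID: "alice"}}
	fmt.Println(
		authorizeChat(q, "alice", "ws-1", "chat-1"), // creator
		authorizeChat(q, "bob", "ws-1", "chat-1"),   // peer in same workspace
	)
}
```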
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: CC-Girl <cc-girl@multica.ai>
* fix: pass model to Hermes ACP session/new and add hermes to InjectRuntimeConfig
- hermes.go: include opts.Model in session/new params so Hermes uses
the configured model instead of its default (fixes local LLM failures)
- runtime_config.go: add "hermes" to the AGENTS.md provider list so
Hermes receives the Multica runtime instructions and skill discovery
Fixes: https://github.com/multica-ai/multica/issues/1195
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(hermes): drop false native-skill claim and add regression tests
The previous change added 'hermes' to the 'skills discovered automatically'
branch of buildMetaSkillContent, but resolveSkillsDir has no Hermes case so
skills still land in the .agent_context/skills/ fallback. AGENTS.md ended up
claiming native discovery while the files were somewhere else, which would
mislead Hermes (and future debuggers).
- Move 'hermes' to the fallback branch alongside 'gemini' so AGENTS.md points
Hermes at .agent_context/skills/ — matching where writeContextFiles actually
writes them.
- Extract buildHermesSessionParams so the session/new payload is unit-testable.
- Add regression tests covering:
* buildHermesSessionParams includes/omits 'model' correctly
* InjectRuntimeConfig('hermes') writes AGENTS.md with the fallback hint
* writeContextFiles('hermes') writes skills to .agent_context/skills/
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: CC-Girl <cc-girl@multica.ai>
* feat(feedback): add in-app feedback flow and Help launcher
Replaces the duplicated bottom-sidebar user popover and "What's new" links
with a single Help menu (Docs / Feedback / Change log) pinned to the
sidebar footer. Feedback opens a rich-text modal that POSTs to a new
/api/feedback endpoint; submissions land in a dedicated feedback table
with per-user hourly rate limiting (10/hr) to deter spam without adding
middleware infrastructure. User identity (avatar + name + email) moves
into the workspace dropdown header so the sidebar is no longer visually
redundant.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(feedback): harden submit path and cap request body
- Read editor markdown via ref at submit time instead of debounced state,
so ⌘+Enter immediately after typing doesn't drop the last keystrokes.
- Block submission while images are still uploading; toast prompts the
user to wait instead of silently sending markdown with blob: URLs
that get stripped.
- Cap /api/feedback request body at 64 KiB via MaxBytesReader so an
authenticated client can't bloat the metadata JSONB column with an
oversized url field.
- Add Go handler tests covering happy path, empty-message rejection,
and the hourly rate limit boundary.
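For reference, a body cap of this kind is usually a one-line wrap with `http.MaxBytesReader`. The sketch below is an assumption-laden illustration: `feedbackRequest`, `createFeedback`, `statusFor` and the 413 / 400 / 201 status choices are hypothetical, and only the 64 KiB limit and the empty-message rejection come from the notes above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

const maxFeedbackBody = 64 << 10 // 64 KiB

// feedbackRequest is a hypothetical, minimal request shape.
type feedbackRequest struct {
	Message string `json:"message"`
	URL     string `json:"url"`
}

func createFeedback(w http.ResponseWriter, r *http.Request) {
	// Reads beyond the cap fail with *http.MaxBytesError.
	r.Body = http.MaxBytesReader(w, r.Body, maxFeedbackBody)

	var req feedbackRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		var tooBig *http.MaxBytesError
		if errors.As(err, &tooBig) {
			http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
			return
		}
		http.Error(w, "invalid JSON", http.StatusBadRequest)
		return
	}
	if strings.TrimSpace(req.Message) == "" {
		http.Error(w, "message is required", http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusCreated) // persistence omitted in this sketch
}

// statusFor posts one feedback payload to the handler and returns the
// response status code.
func statusFor(message, url string) int {
	body, _ := json.Marshal(feedbackRequest{Message: message, URL: url})
	req := httptest.NewRequest(http.MethodPost, "/api/feedback", bytes.NewReader(body))
	rec := httptest.NewRecorder()
	createFeedback(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(
		statusFor("The sidebar flickers", ""),
		statusFor("", ""),
		statusFor("hi", strings.Repeat("a", 70<<10)),
	)
}
```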
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(analytics): instrument feedback funnel
Adds two events pairing frontend intent with backend conversion so we
can compute a completion rate for the in-app Feedback modal:
- `feedback_opened` (frontend) — fires once on FeedbackModal mount.
Source is currently always "help_menu" but the type is a union so
future entry points have to extend it explicitly. Workspace id is
attached when present.
- `feedback_submitted` (backend) — fires from CreateFeedback after the
DB insert succeeds and the hourly rate-limit check has passed.
Message content itself is never sent to PostHog; the event carries
a coarse length bucket (0-100 / 100-500 / 500-2000 / 2000+), an
image-presence flag, and the client platform / version pulled from
X-Client-* headers via middleware.ClientMetadataFromContext.
Affects no existing funnel; seeds a new Feedback funnel for product
triage.
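A small sketch of the coarse length bucketing, so that only the bucket label (never the text) leaves the server. `lengthBucket` is a hypothetical name, and since the listed ranges share their endpoints, the half-open boundaries and the use of rune count below are assumptions.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// lengthBucket maps a message length to a coarse label.
// Boundaries are treated as half-open: [0,100), [100,500), [500,2000), 2000+.
func lengthBucket(n int) string {
	switch {
	case n < 100:
		return "0-100"
	case n < 500:
		return "100-500"
	case n < 2000:
		return "500-2000"
	default:
		return "2000+"
	}
}

func main() {
	msg := "The sidebar flickers when I reorder pins."
	// Count runes rather than bytes so non-ASCII text is not over-counted.
	fmt.Println(lengthBucket(utf8.RuneCountInString(msg)))
}
```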
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a doc comment on GetConfig spelling out that the endpoint is mounted on
the unauthenticated route group (so the login page can fetch GoogleClientID /
AllowSignup before the user is signed in) and that only instance-level public
fields may be added. Prevents accidentally returning user- or tenant-scoped
data from this handler in the future.
Follow-up to #1453. That PR fixed the Tasks tab crash by filtering empty
issue_id out of the detail lookup and rendering a neutral "Task without
linked issue" label, but every issue-less task — chat-spawned or
autopilot-spawned — looked the same. The server already stores the
origin in `agent_task_queue.chat_session_id` / `autopilot_run_id`; only
the HTTP serializer was dropping them.
Server:
- `taskToResponse` now populates `ChatSessionID` and the new
`AutopilotRunID` on `AgentTaskResponse`. Backward compatible: both
fields are omitted when the UUID is invalid, and existing clients
ignore unknown fields.
Types:
- `AgentTask` (TS) gains `chat_session_id?` + `autopilot_run_id?` and a
comment clarifying when `issue_id` is empty.
Tasks tab:
- Row label for issue-less tasks is picked from the populated source
field: "Chat session" for chat tasks, "Autopilot run" for autopilot
tasks, "Task without linked issue" as the neutral fallback. Rows stay
inert (no anchor) in all three cases; existing issue-linked path is
unchanged.
Tests:
- Two new regression tests assert the chat and autopilot labels render
correctly and neither row becomes an anchor. Existing neutral-label
test stays as the "neither source populated" case.
Publish stable GHCR self-host images, switch self-host deploys to official image pulls with a source-build fallback, and move self-host signup / Google OAuth config onto runtime /api/config.
* feat(analytics): capture onboarding funnel events + person-property $set
Closes the visibility gap introduced by the Onboarding relaunch: the
five new steps between signup and workspace_created were invisible to
PostHog, and we couldn't see Step 3 web-fork drop-off, cloud waitlist
intent, or starter-content acceptance at all.
Server-side events (see docs/analytics.md for full contracts):
- onboarding_questionnaire_submitted — fires once when all three
answers first land; also $set's role/use_case/team_size on the
person so every subsequent event is cohortable
- agent_created — not onboarding-specific; is_first_agent_in_workspace
isolates the Step 4 signal
- onboarding_completed — fires on the actual NULL → timestamp flip
with completion_path (full / runtime_skipped / cloud_waitlist /
skip_existing / unknown) + joined_cloud_waitlist
- cloud_waitlist_joined — sizes hosted-runtime interest
- starter_content_decided — imported vs dismissed, split by
agent_guided / self_serve branch on both sides
Also adds Event.Set (→ PostHog $set) alongside the existing SetOnce so
the same events can carry mutable cohort signals without a separate
identify round-trip.
* feat(analytics): wire frontend onboarding events + completion_path
- captureEvent / setPersonProperties helpers in @multica/core/analytics,
with the same pre-init buffering as identify/pageview so config races
don't drop step transitions
- onboarding_runtime_path_selected fires from step-platform-fork for
the three web-fork choices (download desktop / CLI / cloud waitlist),
plus platform_preference on person properties for downstream splits
- completeOnboarding now takes an OnboardingCompletionPath; the
onboarding shell derives full / runtime_skipped / cloud_waitlist
from runtime + waitlist state (lifted to the shell so StepFirstIssue
can see both), and handleWelcomeSkip passes skip_existing
- saveQuestionnaire mirrors team_size/role/use_case into person
properties via $set so every event on this user becomes cohortable
- StepAgent sends the template slug, StarterContentPrompt passes
workspace_id on dismiss so the server can mirror the branch label
* docs(analytics): document onboarding funnel events + $set person properties
`ListPins` used to join `issues` / `projects` so each pin row carried a
`title`, `status`, `identifier`, and `icon`. Convenient for the sidebar
but architecturally wrong: those fields live on a different cache key
than the pin query, so an `issue:updated` WS event invalidates
`issueKeys` and never touches `pinKeys`. The sidebar therefore showed
stale issue status / titles on pinned rows until a hard refresh —
and the same shape would silently re-emerge for any new enriched
field added later.
This refactor moves the join to the client so display data flows from
its real source of truth:
Server (`server/internal/handler/pin.go`):
- `PinnedItemResponse` keeps only pin-owned columns (id, workspace_id,
user_id, item_type, item_id, position, created_at).
- `ListPins` no longer fetches issues / projects in the loop and no
longer hides orphaned pins; the client decides how to render a pin
whose target was deleted.
- `formatIdentifier` helper deleted (was only used by the enrichment
branch); `strconv` import dropped along with it.
Types (`packages/core/types/pin.ts`):
- `PinnedItem` interface now mirrors the bare server shape. The four
enriched fields are removed.
Sidebar (`packages/views/layout/app-sidebar.tsx`):
- New smart wrapper `PinRow` resolves each pin's display data via
`useQuery(issueDetailOptions(...))` or `useQuery(projectDetailOptions(...))`
with `enabled` gates on `pin.item_type` so the hook order stays
stable. Loading renders a flat skeleton; error / 404 renders null
(orphan pins hide themselves).
- `SortablePinItem` becomes purely presentational: it now takes
`label` and `iconNode` as props instead of reading them off the pin
object. dnd-kit / navigation wiring untouched.
- Same pattern as `packages/views/search/search-command.tsx:151`,
which already uses per-row detail queries for Recent issues.
WS sync layer is unchanged: `onIssueUpdated` already patches
`issueKeys.detail`, so changing an issue's status now flows directly
into the sidebar without any cross-entity invalidation. The `pin:*`
prefix handler still invalidates `pinKeys` for create / delete /
reorder, which remains the correct signal for the pin LIST itself.
Verified: views typecheck + core typecheck + web typecheck +
desktop typecheck + go test ./internal/handler/... + vitest
(views: 165 tests, core: 83 tests) all pass.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(slugs): reserve homepage + expand reserved slug list (MUL-961)
- Fix: `homepage` was a live `/homepage` landing route in apps/web but not
in the reserved list, so a user could register a workspace slug that
shadowed the landing page. Now reserved on both backend and frontend.
- Add likely-future global routes (home, dashboard, profile, account,
billing, notifications, search, members) so we don't have to do another
audit/rename pass when these get wired up.
- Add API/ops prefixes (v1, v2, graphql, webhooks, sdk, tokens, cli,
health, ws, metrics, ping) as defense-in-depth against collision with
API aliases and ops endpoints.
- Clarify in both source files that the dotted/underscored entries in the
"Next.js / web standards" section are currently unreachable under the
slug regex `^[a-z0-9]+(?:-[a-z0-9]+)*$` and are kept as defense-in-depth
in case the regex is ever relaxed.
- Add audit migration 056 following the 047/049 pattern to fail loud if
any production workspace slug collides with the newly reserved set.
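The interaction between the slug regex and the reserved list can be sketched like this. The regex is the one quoted above; `validateSlug` is a hypothetical helper, the map holds only a handful of the reserved names, and `robots.txt` is an illustrative stand-in for a dotted entry rather than a confirmed member of the list.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// slugPattern is the slug regex quoted in the commit message.
var slugPattern = regexp.MustCompile(`^[a-z0-9]+(?:-[a-z0-9]+)*$`)

// reservedSlugs is a small illustrative subset of the reserved list.
var reservedSlugs = map[string]bool{
	"homepage":   true,
	"home":       true,
	"dashboard":  true,
	"v1":         true,
	"graphql":    true,
	"robots.txt": true, // hypothetical dotted entry: unreachable under slugPattern
}

var (
	errInvalidSlug  = errors.New("slug has invalid format")
	errReservedSlug = errors.New("slug is reserved")
)

// validateSlug applies the format check first, then the reserved list.
// Dotted or underscored reserved entries can never reach the second check
// today; they only matter if the regex is relaxed later.
func validateSlug(slug string) error {
	if !slugPattern.MatchString(slug) {
		return errInvalidSlug
	}
	if reservedSlugs[slug] {
		return errReservedSlug
	}
	return nil
}

func main() {
	for _, s := range []string{"my-team", "homepage", "robots.txt", "a--b"} {
		fmt.Printf("%-12s %v\n", s, validateSlug(s))
	}
}
```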
* fix(slugs): rename prod conflicts in migration 056 (home → home-1, dashboard → dashboard-1)
Per db-boy's prod audit in the MUL-961 thread, two §3 slugs had live prod
workspaces at reservation time. Decision on MUL-961: force-rename both in
the audit migration (scheme 1), same playbook as MUL-972 for admin/multica/
new/www.
- `home` → `home-1` (68a982da, zzlye, 2026-04-14)
- `dashboard` → `dashboard-1` (ea5a332f, 王争, 2026-04-22)
Targeted UPDATEs land first, followed by a generic `<slug>-N` fallback that
handles any row that slips in between the audit snapshot and deploy. A
post-condition block re-queries the reserved set and fails loud if anything
slipped through.
Down migration reverts the two targeted renames deterministically (they're
keyed by workspace_id, so rollback is safe).
Owner outreach (email zzlye@ + 王争@ about the URL change) is tracked as a
follow-up outside this PR.