Commit Graph

3738 Commits

Author SHA1 Message Date
Bohan Jiang
189f95fabb docs: tidy agent runtime provider pages, add per-runtime FAQ (MUL-3617) (#4500)
* docs: tidy agent runtime provider pages, add per-runtime FAQ (MUL-3617)

- Remove the Gemini CLI provider from install-agent-runtime and providers
  across all four languages (Google folded the standalone CLI into
  Antigravity). Update tool counts 12 -> 11 and the dependent
  session-resumption, MCP, and skill-path sections.
- Add the Hermes profile custom_args workaround as a per-runtime FAQ note
  under providers#hermes (supersedes #4497, which placed it in agents-create).
- Fix stale Japanese install copy that claimed only Claude Code reads
  mcp_config and linked to a non-existent anchor.

Co-authored-by: multica-agent <github@multica.ai>

* docs: add Qoder and CodeBuddy runtimes to provider pages (MUL-3617)

Document the two newly added runtimes on install-agent-runtime and
providers across all four languages:

- Qoder (Alibaba): ACP-over-stdio CLI `qodercli`, shares the transport
  with Hermes/Kimi/Kiro; session/resume, ACP mcpServers, dynamic model
  discovery, native skills at .qoder/skills/.
- CodeBuddy (Tencent): Claude Code-compatible CLI `codebuddy`, driven via
  stream-json; --resume, --mcp-config, dynamic models, .claude/skills/.

Update tool counts 11 -> 13 and the MCP section (now ten of thirteen
consume mcp_config; the other three still ignore it).

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 14:35:26 +08:00
Multica Eve
8ad673fdb7 MUL-3560: gate slim runtime brief behind runtime_brief_slim feature flag (#4449)
The MUL-3560 slim runtime brief — kind-driven dispatcher, per-section
gating, prose compression for ~7k chars saved on the typical
comment-triggered task — now ships behind the `runtime_brief_slim`
feature flag wired via the framework-level service from MUL-3615.

Default: OFF in every environment (production stays on the legacy
brief that has shipped for ~2 years). Staging opts in via the YAML
rule set; ops can override per-process with `FF_RUNTIME_BRIEF_SLIM=true`.
Production is held back until staging has burned in long enough that
we are confident the slim brief does not regress agent behaviour.

Architecture (one toggle point, two code paths, both fully tested):

  buildMetaSkillContent (runtime_config.go)
      │
      └─ useSlimBrief() → false (default)
      │     → fall through to the legacy verbose body that ships on
      │       main today — byte-for-byte unchanged, no migration risk
      │
      └─ useSlimBrief() → true
            → buildMetaSkillContentSlim (runtime_config_sections.go)
              → classifyTask → 5-way kind switch → per-section writers

BuildCommentReplyInstructions takes the same gate, so the per-turn
comment prompt and the runtime brief stay in sync on which template
they emit.

What's in this PR:

- runtime_config_flag.go (new): package-scope `runtimeFlags` atomic
  pointer + `SetFeatureFlags` setter + `useSlimBrief` toggle point.
  Nil-safe: a daemon that forgets to wire the service falls back to
  legacy, no panic.
- runtime_config_kind.go (new): `taskKind` enum + `classifyTask` +
  `hasIssueContext` predicate. Used only by the slim path.
- runtime_config_sections.go (new): the slim brief itself —
  `buildMetaSkillContentSlim` + per-section `writeXxx` helpers
  + `writeAvailableCommandsQuickCreate` minimal variant +
  `writeBackgroundTaskSafetySlim` compressed safety section. The
  Section × Kind matrix is documented inline on
  `buildMetaSkillContentSlim` and the test below checks the
  dispatcher does not diverge from the spec.
- reply_instructions.go: `BuildCommentReplyInstructions` gains a
  short slim-or-legacy prelude; new `buildCommentReplyInstructionsSlim`
  is the compressed cookbook (defers the shell-hazard rationale to
  `## Comment Formatting`).
- runtime_config.go: `buildMetaSkillContent` gains a 2-line
  dispatcher at the top; the legacy body is otherwise untouched.
- runtime_config_kind_test.go (new): canaries for both paths.
  - TestClassifyTask: 5 kinds + 3 tiebreak cases.
  - TestTaskKindHasIssueContext: predicate semantics.
  - TestSlimFlagOffUsesLegacy: nil flag service → legacy path
    (renders "Get full issue details.", a legacy-only substring).
  - TestSlimFlagOnUsesSlim: flag on → slim path (renders "full
    issue.", a slim-only one-liner) AND must NOT render legacy
    "Get full issue details.".
  - TestBuildMetaSkillContentSlimKindMatrix: locks the per-kind
    section set; heading match is line-anchored so inline references
    don't trip absence assertions.
  - TestSlimQuickCreateAvailableCommands: locks the minimal-variant
    content for quick-create (issue create present, every other
    Core command absent).
  - TestSlimBriefIsSubstantiallyShorter: ≥ 30% reduction guard so
    a future change can't accidentally re-bloat the slim path back
    to legacy levels.
- cmd/server/main.go: now calls `execenv.SetFeatureFlags(flags)`
  immediately after constructing the feature flag service.

Measured impact (slim vs legacy, claude provider, realistic fixture
with 2 repos + 2 skills + member initiator):

    legacy = 19567 chars
    slim   = 11868 chars    Δ = -7699  (-39.3%)

Verification:

- go vet ./internal/daemon/... ./cmd/server/...                  ok
- go test ./internal/daemon/...                                  ok
- go test ./pkg/featureflag/...                                  ok
- TestSlimBriefIsSubstantiallyShorter logs the 39.3% ratio
- TestSlimFlagOffUsesLegacy + TestSlimFlagOnUsesSlim pass both
  directions, so the dispatcher is locked in code.

The pre-existing `internal/handler` test failures
(TestLeaveWorkspace_RevokesOwnRuntimes,
TestDeleteMember_CancelsTasksFromAgentReassignment,
TestDeleteMember_NoRuntimes_DeletesMember) reproduce on plain
`origin/main` with the same `relation "channel_user_binding" does
not exist` SQL error — they are a missing-migration bug from the
recent channels foundation PR (ce28d0aa0), not anything this PR
touched.

Rollout plan:

1. Merge this PR. Production daemons keep emitting the legacy brief
   (flag default false).
2. Add a YAML rule to staging's
   `MULTICA_FEATURE_FLAGS_FILE`:

       runtime_brief_slim:
         default: true

   Staging daemons start emitting the slim brief on next restart.
3. Watch `agent prompt prepared` logs + agent behaviour for 7 days.
4. If staging is clean, flip the prod YAML to `default: true`.
   Legacy code path stays in the binary as a kill-switch
   (`FF_RUNTIME_BRIEF_SLIM=false` to revert without a deploy).
5. After ~30 days clean in prod, follow up with a PR that deletes
   the legacy body and the flag — same pattern as docs/feature-flags.md
   recommends ("plan the death of the flag at birth").

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 14:23:17 +08:00
Wilson-G
b92e4a53fb DH-106 为飞书接入补上 /new 会话指令 (MUL-3503) (#4396)
Lark/飞书入站消息新增 /new 首行指令,解析为 force_fresh_session,复用既有 daemon 会话续接门控。

Co-authored-by: Wilson-G <Wilson-G@users.noreply.github.com>
2026-06-24 14:16:22 +08:00
Multica Eve
4a8210912a feat(featureflag): framework-level feature flag system (MUL-3615) (#4496)
* feat(featureflag): framework-level feature flag system (MUL-3615)

Introduces a reusable feature flag framework so future features can adopt
flags without writing infrastructure code.

Backend: server/pkg/featureflag (Go)
- Service / Provider / Decision separation per Martin Fowler's Toggle
  Point / Toggle Router / Toggle Configuration pattern.
- Providers: StaticProvider (rules in source control), EnvProvider
  (FF_<KEY> overrides for ops kill switches), ChainProvider
  (first-hit-wins composition).
- EvalContext carried through context.Context with WithEvalContext /
  EvalContextFrom; supports user_id, workspace_id, free-form attributes.
- PercentRollout via deterministic FNV-1a bucketing; same user always
  lands in the same bucket so experiments do not flap between requests.
- Nil-safe Service: a nil *Service or missing flag returns the caller's
  default so business code never panics on a missing flag.
- 100% unit-test coverage with -race; go vet clean.

Frontend: packages/core/feature-flags (TypeScript)
- Same vocabulary as the Go side (Decision, EvalContext, Rule,
  PercentRollout). FNV-1a parity ensures cross-tier bucket agreement.
- FeatureFlagService + StaticProvider + ChainProvider in pure TS.
- React glue: FeatureFlagsProvider, useFlag(key, default),
  useVariant(key, default). Hooks fall back to the default when no
  provider is mounted so Storybook / unit tests stay simple.
- Vitest tests for service, providers, hash, and React hooks.

Docs: docs/feature-flags.md — wiring, EvalContext, toggle points,
backend-protection note, and the standard best-practice checklist.

The framework intentionally has no third-party Go deps and no API
surface beyond what real callers will need. New providers (DB, remote
config, LaunchDarkly) plug in by implementing Provider; no existing
caller has to change.

Co-authored-by: multica-agent <github@multica.ai>

* fix(featureflag): cross-tier hash parity + variant only when enabled (MUL-3615)

Two must-fix issues from the PR review on #4496:

1. TS hash had a trailing zero separator that Go did not emit, so the
   same (key, identifier) bucketed differently on the two tiers. The
   "user lands in the same bucket on server and client" promise was
   broken. For example billing_new_invoice/user-42 was bucket 97 in Go
   and bucket 11 in TS.

   Fix: TS fnv1a now emits the zero separator BETWEEN parts only, never
   after the last one, matching Go's hash.Write byte stream exactly.
   Verified by parallel golden tests on both sides that pin five
   (key, identifier) -> bucket triples; if either side drifts both tests
   fail and one must be brought back in sync.

2. StaticProvider returned `Rule.Variant` regardless of whether the rule
   evaluated to enabled=true. A 0%-rollout user, a deny-listed user, or
   a default-off user would see variant="experiment-v2", so callers
   branching on Variant() would route control users into the experiment
   arm.

   Fix: Rule.Variant is now the ON-variant only. When the rule evaluates
   to enabled=false the Decision's variant is the canonical "off",
   regardless of what Rule.Variant says. Documented as a behavior
   contract in the Rule godoc / JSDoc and covered by regression tests
   on both sides.

Tests: - go test -race ./pkg/featureflag/...  : all green (1.58s).
  - pnpm --filter @multica/core test     : 661/661 (3 new).
  - pnpm --filter @multica/core typecheck: clean.
Co-authored-by: multica-agent <github@multica.ai>

* fix(featureflag): hash UTF-8 bytes on the TS side for cross-tier parity (MUL-3615)

Follow-up review on PR #4496 caught that the previous hash fix was only
correct for ASCII input. The TS side used `charCodeAt`, which returns
UTF-16 code units, while the Go side hashes the UTF-8 byte
representation. Any non-ASCII flag key or identifier — Chinese flag
names, accented user IDs, emoji — would bucket differently on backend
vs frontend, silently breaking the "same user, same bucket" promise the
PR description makes.

Concretely:
  flag/é         Go 53  vs TS-old 68
  flag/🦄        Go 82  vs TS-old 75
  实验/user-1    Go 90  vs TS-old 4
  flag/用户-1    Go 95  vs TS-old 2

Fix: replace per-char charCodeAt with a module-level `TextEncoder`
('utf-8') and hash each encoded byte. After the fix all four cases above
match Go exactly, and the existing ASCII cases continue to match.

The cross-language golden tables on both sides now include the 5 new
non-ASCII cases alongside the 5 ASCII cases, so any future regression
that swaps UTF-8 for charCodeAt (or vice versa) will fail loudly on
both Go and TS simultaneously.

TextEncoder is part of WHATWG Encoding and is available in every
evergreen browser, in Node 11+, and in Hermes (React Native) >= 0.74,
which covers every runtime that imports @multica/core/feature-flags.

Tests: - go test -race ./pkg/featureflag/...   : all green.
  - pnpm --filter @multica/core test      : 661/661.
  - pnpm --filter @multica/core typecheck : clean.
Co-authored-by: multica-agent <github@multica.ai>

* feat(featureflag): wire into main app config — YAML file + env override (MUL-3615)

Follow-up requested by Yushen on PR #4496: make the feature flag
framework configurable through the existing main-program config system
instead of requiring Go code edits. multica's main app is purely env-var
driven (see .env.example) with optional MULTICA_*_FILE knobs for richer
config; feature flags now follow the same pattern.

server/pkg/featureflag/config.go
  - LoadRulesFromYAMLFile(path) parses a YAML rule set into runtime
    Rule structs. Empty files are a valid "no flags yet" state; missing
    or malformed files surface a hard error so operators see misconfig
    the same way DATABASE_URL parse errors do.
  - NewServiceFromEnv composes the standard provider chain:
      1. EnvProvider("FF_")               (runtime kill-switch path)
      2. StaticProvider from YAML file    (declarative rule set)
    When MULTICA_FEATURE_FLAGS_FILE is unset, only the env layer is
    active and every IsEnabled call falls through to the caller's
    default, so the server can boot before any flag is authored.

server/cmd/server/main.go
  - Construct the Service once at startup right after env-var warnings,
    fail loudly on malformed YAML, log the loaded rule count via the
    Service logger. The Service is held in a local `flags` variable
    ready to be threaded into handler.Handler / service constructors
    when the first flag user lands. Threading is deferred to the PR
    that adds the first business consumer so this PR stays a pure
    framework + config layer.

.env.example
  - New "Feature flags" section documents MULTICA_FEATURE_FLAGS_FILE and
    the FF_<KEY> override convention, with a minimal YAML schema example
    inline.

docs/feature-flags.md
  - Replace the "build a provider manually" example with the
    NewServiceFromEnv pattern that now matches what main.go actually
    does. Show the YAML schema in one place. Note the on-variant /
    off semantics from the previous review round.

server/pkg/featureflag/doc.go
  - Update package doc to mention the gopkg.in/yaml.v3 dependency
    (already a server-level dep) instead of the now-inaccurate
    "no third-party dependencies" claim.

Tests: - go test -race -count=1 ./pkg/featureflag/...   all green; new
    config_test.go covers: simple YAML, full-shape YAML, empty file,
    missing file, malformed YAML, no env var, file-only, env-beats-file,
    bad file surfaces error.
  - go test -race -count=1 -run TestHealth ./cmd/server/...   sanity
    check that the main.go boot path with the new wiring still passes.
  - go vet ./...   clean.
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 13:49:59 +08:00
Naiyuan Qing
a7908e6967 fix(issues): sync header agent chip with execution log via shared query (#4498)
The header live chip derived its active-task state from the workspace-wide
agent-task-snapshot, while the right-panel Execution log read the per-issue
task list. Two queries, two endpoints, two independent refetches: the heavier
workspace snapshot lands later than the per-issue list, so the log could show
a running task while the header chip had not started yet.

Point the chip at the same `issueKeys.tasks(issueId)` cache the Execution log
uses (identical query options). Both surfaces now observe one cache entry and
update atomically. Drop the now-redundant workspace-id lookup and client-side
issue_id filter, since the endpoint is already issue-scoped.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 13:26:50 +08:00
Multica Eve
00b9668cd2 fix(autopilot): cold-start planner honors trigger.last_fired_at (MUL-3551) (#4495)
Post-deploy of the new scheduled-dispatch scheduler (PR #4444), an
autopilot configured for "weekdays 17:10 Asia/Shanghai" fired at
~12:30 Beijing the day after deploy — ~4h 38m before the next
scheduled time the UI showed. Traced to a cold-start regression in
the planner hook:

Old behaviour
-------------
On the first tick after migration the hook found no
`sys_cron_executions` row for the trigger
(`latestPlan(...).Found == false`) and anchored on the trigger's
`created_at`, then applied the 24h replay cap:

  after := cfg.CreatedAt
  if oldest := now.Add(-replayWindow); after.Before(oldest) {
      after = oldest // now - 24h
  }

For a trigger created days/weeks earlier and last fired by the
legacy goroutine at Mon 17:10 Beijing (= Mon 09:10 UTC), this set
`after = Tue 04:13 UTC - 24h ≈ Mon 04:13 UTC`. The half-open
enumeration `(Mon 04:13 UTC, Tue 04:13 UTC]` STILL contained Mon
09:10 UTC — the occurrence the legacy code had already handled —
so the new scheduler dispatched it again the moment it took over.
The result: a SCHEDULED-source autopilot_run with planned_at = Mon
17:10 Beijing but a wall-clock dispatch at Tue ~12:30 Beijing.

Timezone math was correct; the bug was purely the cold-start
anchor not respecting prior-fire history.

Fix

Co-authored-by: multica-agent <github@multica.ai>
---
The `autopilot_trigger.last_fired_at` column is maintained by both
the legacy goroutine and the new scheduler (via
TouchAutopilotTriggerFiredAt), so it is the authoritative
"most-recent successful fire" cursor across the migration boundary.
The planner hook now anchors cold-start enumeration on it:

  case latest.Found:        after = latest.PlanTime
  case lastFiredAt != zero: after = lastFiredAt
  default:                  after = cfg.CreatedAt

For the regressed case, `after = Mon 17:10 Beijing`, the next
enumeration window is `(Mon 17:10, Tue 12:30]`, and Tue 17:10 is
in the future — the hook returns nothing and the trigger waits
quietly for Tue 17:10 as the UI promised. For brand-new triggers
(last_fired_at NULL), the original `created_at` path still
applies. For long-dormant triggers the `replayWindow` cap remains.

Changes
-------
* `ListSchedulableAutopilotTriggers` SQL now returns
  `last_fired_at`.
* `autopilotTriggerConfig.LastFiredAt` is populated by the scope
  provider on every tick.
* `autopilotPlansForScope` cold-start branch uses the new anchor.

Tests
-----
* TestAutopilotScheduleJobColdStartHonorsLastFiredAt — seeds the
  exact dev-environment shape (created 3 days ago, last_fired_at
  5 hours ago, no sys_cron_executions row), runs a tick, asserts
  zero exec rows AND zero autopilot_run rows. Without the fix this
  test produces one of each at a historical plan_time.
* TestAutopilotScheduleJobColdStartBrandNewTriggerStillFires —
  asserts a brand-new trigger (last_fired_at NULL) still fires its
  first due occurrence on cold start.

All existing `TestAutopilotScheduleJob*` tests still pass.

Refs MUL-3551

Co-authored-by: Eve <eve@multica-ai.local>
2026-06-24 13:01:59 +08:00
Bohan Jiang
ce28d0aa0e feat(integrations): add platform-agnostic channel foundation (MUL-3515) (#4412)
* feat(integrations): add platform-agnostic channel foundation

Introduce server/internal/integrations/channel — the contract every
inbound IM integration implements, so the core never learns a platform's
event JSON. Four pieces:

- Channel interface (Type/Connect/Disconnect/Send/Capabilities) + Factory
  + Config (channel_type + opaque JSON blob, maps to channel_installation).
- Normalized InboundMessage/OutboundMessage envelopes + Source/MediaRef/
  ReplyCtx/MsgType/ChatType. Envelope holds only cross-platform-true
  fields; platform specifics live in Raw, read only by the adapter.
- Capability bitmask: declaration only, no degrade logic in core.
- Registry: Type->Factory map, last-writer-wins, concurrency-safe.

Pure package (no DB/network/platform deps). Foundation for MUL-3515; the
lark cutover + lark_*->channel_* generalization land in follow-up PRs.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* feat(channel): generalize lark_* tables into channel_* (DB layer)

Migration 123 creates channel_installation / channel_user_binding /
channel_chat_session_binding / channel_inbound_message_dedup /
channel_inbound_audit / channel_outbound_card_message /
channel_binding_token. Each carries a channel_type discriminator and a
JSONB config for platform-specific identifiers/credentials; cross-platform
columns stay flat. Existing Feishu rows are backfilled (channel_type=
'feishu', app_secret_encrypted via base64). NO foreign keys / cascades
(MUL-3515 §4) — integrity moves to the app layer in the cutover.

queries/channel.sql ports the lark query surface to channel_*, JSONB-aware,
plus DeleteChannelUserBindingsByWorkspaceMember /
DeleteChannelChatSessionBindingBySession for the app-layer cleanup that
replaces the removed cascades.

lark_* tables/queries are left in place here and removed once the Go
cutover lands, so this commit ships green on its own.

Verified: sqlc generate, go build ./..., full migrate chain (1..123) on
Postgres 17, and a real-data backfill spot-check (base64 round-trip,
NULL-strip, functional unique index on (channel_type, app_id)).

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* fix(channel): name app_id query param + multi-IM install key + null-safe binding merge

Addresses review on MUL-3515 (PR #4412):

- GetChannelInstallationByAppID: explicitly name params and cast app_id to
  ::text so sqlc emits AppID string. A bare $2 next to `config ->> 'app_id'`
  was mis-attributed to the JSONB config column, generating Config []byte.

- channel_installation uniqueness -> (workspace_id, agent_id, channel_type),
  with the UpsertChannelInstallation conflict key matched. Lets one agent
  hold one installation per IM (feishu + slack + ...) instead of a later
  install clobbering an earlier one. Behaviorally identical in the current
  feishu-only world; "one agent, at most one IM overall" stays an app-layer
  rule per MUL-3515 §4, not a DB constraint.

- CreateChannelUserBinding merges jsonb_strip_nulls(EXCLUDED.config) so a
  re-bind carrying {"union_id": null} no longer erases an already-captured
  union_id, restoring the old COALESCE(EXCLUDED.union_id, ...) semantics.

Regenerated with sqlc v1.31.1. Verified on PG17: re-install replaces in
place, feishu+slack coexist, null re-bind keeps union_id, real union_id wins.

Co-authored-by: multica-agent <github@multica.ai>

* feat(lark): channel-backed Feishu store + fix base64 backfill wrapping

Cutover step 1 of switching the lark Go code from lark_* onto the channel_*
tables (MUL-3515). Introduces the JSONB config boundary the rest of the
cutover sits on, and fixes a latent backfill bug surfaced while building it.

- migration 123: strip newlines from the app_secret_encrypted base64 backfill.
  PostgreSQL encode(...,'base64') MIME-wraps at 76 chars, and a secretbox-
  sealed ~72-byte secret exceeds that. Go's encoding/json decodes a JSON
  string into []byte with base64.StdEncoding, which rejects embedded newlines,
  so without the strip every migrated installation would fail to decrypt its
  app secret once reads move to channel_installation.config.

- store.go: flat domain types (Installation / UserBinding / ChatSessionBinding)
  with field parity to the retired db.Lark* rows, plus the feishu config codec.
  Row->domain mappers decode the JSONB config; the secret decoder is
  whitespace-tolerant so legacy MIME-wrapped data still round-trips, while the
  encoder emits unwrapped base64. Binding config encodes an absent union_id as
  "{}" so the upsert's jsonb_strip_nulls merge never clobbers a stored union_id.

- store_test.go: 72-byte secret round-trip, MIME-wrapped tolerance, optional
  null-strip, and flat-column preservation. Verified on PG17.

Field parity keeps the upcoming ~190 db.LarkInstallation call sites a
mechanical rename. No call sites switched yet; behavior unchanged.

Co-authored-by: multica-agent <github@multica.ai>

* feat(lark): route inbound integration onto channel_* + explicit membership checks

Cutover step 2 (MUL-3515): switch the Feishu Go code from the lark_* queries to
channel_* via a ChannelStore adapter, and replace the removed member foreign key
with explicit application-layer membership checks. No user-visible behavior change.

- channel_store.go: ChannelStore embeds *db.Queries and SHADOWS the ~24 lark
  query methods with channel_*-backed equivalents, keeping the db.Lark*
  signatures so the dispatcher/hub/services and their ~20k lines of tests stay
  untouched; the feishu JSONB config is (de)coded by store.go. Adds
  IsWorkspaceMember and a tx-aware WithTx. Only production wiring swaps
  *db.Queries for *ChannelStore.

- Membership re-check (§4 removed the lark_user_binding -> member FK, so a
  binding row no longer proves current membership):
  * the dispatcher inbound identity step verifies membership after the binding
    lookup; a former member's stale binding is dropped as non_workspace_member
    + audited and never reaches chat_session (§4.3 safety property).
  * RedeemAndBind and BindInstallerTx replace the now-dead FK (23503) branch
    with an explicit IsWorkspaceMember gate, preserving the existing
    ErrBindingNotWorkspaceMember outcome without burning the token.

- router wires the ChannelStore into the patcher, typing indicator, dispatcher,
  hub, and the union_id/region backfills; constructor-based services wrap
  *db.Queries internally so their signatures and nil-check tests are unchanged.

Verified: go build ./... ; go vet ; gofmt ; go test -race ./internal/integrations/...
(full lark suite green unchanged + new membership drop/error tests). Adapter
field mappings (secret base64, union_id RMW, chat-id/open-id remaps, dedup,
token, card) checked end-to-end against a PG17 channel_* schema.

lark_* tables and queries remain (unused at runtime) until the S3 cleanup-hooks
and S4 drop-tables/rename commits.

Co-authored-by: multica-agent <github@multica.ai>

* fix(channel): renumber generalization migration 123 -> 124

main merged 123_issue_stage after this branch forked, so the branch's 123_channel_generalization now collides on the migration number. The runner keys schema_migrations by full version string and would still apply both, but a duplicate number is a merge hazard and convention violation, so move the channel migration to the next free slot (124).

issue_stage (ALTER issue ADD COLUMN stage) and the channel generalization touch disjoint tables; verified on PG17 that 123_issue_stage applies cleanly on a DB already carrying 124_channel_generalization, so the two are order-independent. sqlc regenerated (v1.31.1): only the migration-number comment changed.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* feat(channel): prune channel bindings on member removal + chat session delete

MUL-3515 §4 dropped every channel_* foreign key, so the old ON DELETE CASCADE that cleared a user's channel_user_binding when they left a workspace, and a chat's channel_chat_session_binding when its chat_session was deleted, no longer fires. Re-establish that integrity in the application layer, inside the existing transactions: revokeAndRemoveMember -> DeleteChannelUserBindingsByWorkspaceMember, DeleteChatSession -> DeleteChannelChatSessionBindingBySession.

Adds real-DB tests for both paths, including a scoping check that a remaining member's binding survives the prune. Verified on PG17: both new tests plus the existing revocation tests and the full handler package pass.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* fix(channel): scope Lark/Feishu store reads to channel_type='feishu'

The S2 cutover routed the Feishu integration onto channel_*, but the Lark-facing ChannelStore wrappers read installation / chat-session-binding / outbound-card rows across ALL channel_type values. Once a second IM exists, that would let the Lark hub supervise a non-Feishu installation, the Lark install list show it, /lark/installations/{id} revoke another channel's row, and the outbound patcher / typing indicator act on a non-Feishu chat binding or card.

Add a channel_type predicate to the six read/list channel queries and pass channelTypeFeishu from every wrapper: GetChannelInstallation, GetChannelInstallationInWorkspace, ListChannelInstallationsByWorkspace, ListActiveChannelInstallations, GetChannelChatSessionBindingBySession, GetChannelOutboundCardByTask.

The S3 cleanup deletes (DeleteChannelUserBindingsByWorkspaceMember / DeleteChannelChatSessionBindingBySession) stay all-channel on purpose: a member leaving or a chat_session being deleted should clear every IM's binding. Adds a real-DB test that seeds a Slack installation/binding/card next to the Feishu ones and asserts the Lark wrappers never return them.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* refactor(channel): replace db.Lark* translation layer with lark domain types

S2 introduced ChannelStore as a translation layer that read/wrote channel_* but kept the retired db.Lark* struct/param shapes so the dispatcher/hub/services and their ~20k lines of tests did not have to change. This collapses that layer: the store now takes and returns the package's flat domain types (Installation, UserBinding, ChatSessionBinding, InboundMessageDedup, BindingTokenRow, OutboundCardMessage) and the *Params types in params.go, with channel-neutral field names (ChannelUserID / ChannelChatID / ...). All call sites, fakes, and tests move to the domain types.

No behavior change: only channel_* is read/written (as before); db.Lark* is now unused, and the lark_* tables + queries/lark.sql are removed in the next commit. Verified on PG17: go build / vet / gofmt clean, go test -race ./internal/integrations/... green (the ~20k-line fake suite), and the lark + handler suites pass.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* refactor(channel): drop lark_* tables and queries (remove old path)

The Go cutover (previous commit) moved the lark package entirely onto channel_* and the domain types, leaving the lark_* tables, queries/lark.sql, and the generated db.Lark* models unused. Remove them per the design (§5: replace, do not keep both): migration 125 drops the seven lark_* tables (data already lives in channel_* since migration 124), and queries/lark.sql is deleted + sqlc regenerated, removing the db.Lark* models and lark query methods.

The 125 down recreates the authoritative pre-drop schema (bot_union_id, region, per-installation dedup PK, thread-reply columns). Verified on PG17: fresh migrate up ends with lark_* gone + channel_* present; isolated 125 down/up round-trips correctly; go build / vet / gofmt clean; go test -race ./internal/integrations/... and the handler suite pass.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* fix(migrations): remove trailing blank line at EOF of 125 down migration

git diff --check flagged a blank line at EOF of 125_drop_lark_tables.down.sql (a pg_dump-generation artifact). Whitespace only; the recreate SQL is unchanged.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* refactor(channel): defer lark_* table drop to a follow-up migration

Preflight deploy review: dropping lark_* in the same release that cuts over (old migration 125) is not rollback/rolling-safe — the v0.3.27 release still reads lark_*, so a rolling deploy or a post-deploy code rollback would hit "relation does not exist". Remove the drop and keep the old tables for one release (standard expand/contract): migration 124 already backfilled lark_* -> channel_*, the new code reads/writes only channel_*, and the physical drop moves to a separate cleanup migration once this ships and is observed.

The lark_* tables remain in the schema, so sqlc regenerates the (now unused) db.Lark* models; queries/lark.sql stays deleted (the new code uses channel_*). No code path reads lark_* — only the destructive drop is deferred, keeping the design's no-compat-layer / no-dual-write rule while being deploy-safe.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* fix(channel): skip orphaned installations in hub-boot active scan

Preflight deploy review: channel_installation dropped the workspace/agent FK (MUL-3515 §4), so unlike lark_installation it does not cascade away when its workspace is deleted or its agent is hard-deleted (e.g. runtime teardown). The hub-boot query then keeps opening a WebSocket for a bot whose owner is gone.

JOIN ListActiveChannelInstallations to live workspace + agent so an orphaned installation is never connected, uniformly for every deletion path. The JOIN matches the old ON DELETE CASCADE semantics (row existence, not agent archival), so an archived-but-present agent's installation is still listed; the orphaned row's encrypted secret is thereby never decrypted/used.

Tests: a real-DB handler test asserts a deleted-workspace/agent installation and a non-Feishu one are both excluded; the lark scope test's active-list assertion moved there since the JOIN now needs real workspace/agent fixtures. (Physically deleting dormant orphaned channel rows on workspace/agent deletion is a separate app-layer-cleanup follow-up.)

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* docs(channel): document non-rolling cutover constraint for the lark->channel migration

Elon deploy review: keeping the lark_* tables (deferred drop) stops old v0.3.27 code from crashing, but is not full expand/contract. Migration 124 is a one-time backfill; afterwards new code runs on channel_* (lease + dedup on channel_*) while pre-cutover code runs on lark_* (lease + dedup on lark_*). If both run concurrently during a rolling deploy, each side claims the same Feishu bot's WS lease on its own table and double-processes inbound events.

This release therefore requires a NON-ROLLING cutover (stop the old hub before applying migration 124 + starting new code; rollback is not lossless once new code writes channel_*). Documented where deployers/reviewers see it: migration 124 header gains a ROLLOUT note; the channel_store.go header is corrected (lark_* tables are retained one release for rollback safety, not "gone"; the store still never touches them). Comment-only — no schema/codegen/behavior change.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* feat(lark): add MULTICA_LARK_HUB_DISABLED switch for the channel cutover

The lark_*->channel_* cutover needs a way to make the Feishu bot briefly unavailable WITHOUT taking down the whole multica-api process — the Lark hub is a goroutine inside it, not a separate Deployment. MULTICA_LARK_HUB_DISABLED=true parks the hub at startup: the API serves HTTP normally but never claims a WS lease or opens a Feishu connection.

Rollout (see migration 124 ROLLOUT note): ship the new release with the flag SET so new pods run API-only while old pods (hub on lark_*) drain during the rolling deploy — the two hubs never overlap. After the old pods are gone and migration 124 has run, flip the flag off; the new hub comes up on channel_*. The old backend does NOT need this switch — its hub stops when k8s terminates the old pods, not via a flag. Nil-ing LarkHub reuses the existing not-configured path so both the startup start and the shutdown join skip it.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* docs(channel): point migration 124 ROLLOUT note at the hub-disable switch

Refine the rollout note to use MULTICA_LARK_HUB_DISABLED for a bot-only cutover (new pods serve API with the hub parked while old pods drain; flip the switch off after the migration), instead of the earlier whole-API recreate. Comment-only.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* docs(channel): fix migration 124 rollout order and document self-host cutover

The previous ROLLOUT note shipped the new (channel_*) build before
running migration 124, so the channel_*-backed HTTP paths (installation
list/install/revoke, chat-session delete, member revoke) would 500 in
the window between new-pod boot and the deferred migration. Restate the
runbook around two explicit invariants — channel_* must exist before the
new build serves those paths, and the old/new hubs must never overlap —
and order the steps so channel_* is created first (park old hub -> snapshot
-> deploy parked new build -> unpark). Document that default self-host
(entrypoint migrate + single-replica Recreate) satisfies both invariants
automatically and needs no manual steps; only prd / multi-replica rolling
self-host needs the switch procedure. Clarify in main.go that the
hub-park switch is generation-agnostic (parks whichever hub the build
carries), which is what enables the preparatory release.

Refs MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 12:46:20 +08:00
Bohan Jiang
3103ed1082 fix(agent): surface Antigravity provider log failures (#4494)
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 12:45:52 +08:00
Ryan Yu
4064b164be docs: add agent runtime install page to zh nav (#4480) 2026-06-24 12:40:30 +08:00
Multica Eve
131ca80a6c refactor(autopilot): migrate scheduled dispatch to scheduler.Manager (3/3 MUL-3551) (#4444)
* refactor(autopilot): migrate scheduled dispatch to scheduler.Manager

PR 3 of 3 for the scheduled-Autopilot refactor on MUL-3551.

Replaces the legacy cmd/server/autopilot_scheduler.go goroutine
(30 s app-clock polling, app-time cron advancement, weak crash
recovery) with a JobSpec registered on the existing
scheduler.Manager. sys_cron_executions is now the lease + audit
table for scheduled Autopilot occurrences, and the unique key on
(job_name, scope_kind, scope_id, plan_time) is the primary
guarantee that the same planned fire time cannot produce two runs.

What changed

  * server/internal/scheduler/jobs_autopilot.go
    New AutopilotScheduleDispatchJob factory:
      - scope_kind = "autopilot_trigger", scope_id = trigger.id
      - PlansForScope hook (from PR 1) enumerates cron occurrences
        in (lastPlan, dbNow] and collapses missed fires to the most
        recent one (CatchUpLatestOnly — same policy the legacy
        goroutine had, now provable via a one-row-per-tick audit).
      - Handler re-loads trigger + autopilot inside the handler so a
        between-tick state change (paused, disabled, deleted) takes
        effect immediately and is recorded as a no-op SUCCESS row
        with skipped_reason in the result JSON.
      - Calls AutopilotService.DispatchAutopilotForPlan (from PR 2)
        for the actual run creation; that path is itself idempotent
        on (trigger_id, planned_at), so a stale-steal retry reuses
        the run created by the prior attempt instead of duplicating.
      - RunTimeout=2m, StaleTimeout=5m, HeartbeatInterval=30s,
        AllowStaleReentry=true, MaxAttempts=3, RetryBackoff
        [1m, 5m, 15m], MaxPlansPerTick=5 (safety cap).

  * server/internal/scheduler/manager.go
    Manager.runOnce promoted to RunOnce (exported) so external test
    packages can drive deterministic ticks; existing call sites in
    this package + cmd/server tests updated.

  * server/internal/service/cron.go
    NextOccurrenceAfterUTC and NextOccurrencesUTC: cron evaluators
    that take an explicit "now" instant. Callers pass dbNow() so
    schedule decisions stay consistent across app instances with
    clock skew. Legacy ComputeNextRun is preserved (delegating to
    NextOccurrenceAfterUTC with time.Now()) for the display-only
    autopilot_trigger.next_run_at write path — scheduling decisions
    no longer use it.

  * server/pkg/db/queries/autopilot.sql
    ListSchedulableAutopilotTriggers replaces the legacy
    ClaimDueScheduleTriggers (the new path no longer mutates
    autopilot_trigger.next_run_at on claim). RecoverLostTriggers
    removed — sys_cron_executions lease theft now handles crash
    recovery without an in-handler restart sweep.

  * server/cmd/server/main.go
    The "go runAutopilotScheduler(...)" line is gone. The new
    JobSpec is registered alongside TaskUsageHourlyJob on the
    existing schedulerMgr (still using sweepCtx for lifecycle).

  * server/cmd/server/autopilot_scheduler.go DELETED.

Tests

  * server/internal/service/cron_test.go — unit tests for the cron
    helpers: timezone-aware enumeration, half-open (after, until]
    window, plan_time-exclusive "after", invalid inputs surface
    parse errors, and the "ignores wall clock" property the
    scheduler relies on.

  * server/cmd/server/autopilot_schedule_job_test.go — DB-backed
    integration tests:
      - DispatchesOnce: one tick → 1 SUCCESS exec row + 1
        autopilot_run with planned_at set; a second tick does not
        regress the count.
      - MissedSchedulesCollapse: an hour of missed */5 fires
        produce a single autopilot_run, not 12.
      - CrashRecovery: simulated stale RUNNING lease at the same
        plan_time → second tick reclaims it and DOES NOT duplicate
        autopilot_run.
      - TwoRunnersSingleWinner: two concurrent
        scheduler.Manager instances on the same trigger →
        per-plan_time uniqueness holds (sys_cron_executions never
        has two RUNNING rows at the same plan_time, autopilot_run
        count == exec row count).
      - DisabledTriggerSkips: a trigger disabled between
        scope-list and tick produces no exec row.
      - PausedAutopilotSkipsAtHandler: an autopilot paused after
        the first tick does not produce a new exec row.
      - BadCronFailsLoudly: an invalid cron expression never fires
        dispatch (parse error surfaces in the plan hook).
    Existing autopilot listener / squad / dispatch tests still
    pass.

  * server/internal/scheduler/plans_for_scope_test.go from PR 1
    still passes (RunOnce rename only).

Verification

  * go build ./...
  * go vet ./...
  * go test ./internal/scheduler ./internal/service ./cmd/server
    ./internal/handler — all green.

Rollback

  * Reverting this commit re-introduces the legacy goroutine.
    Migration 124 (PR 2) and the scheduler hook (PR 1) stay in
    place. Autopilot data on disk is forward- and backward-
    compatible: planned_at columns are nullable, the legacy
    goroutine never reads planned_at and the new job never reads
    autopilot_trigger.next_run_at.

Refs MUL-3551

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): scheduler hook retries FAILED plans + tighten tests

Review fix for #4444 (MUL-3551).

Blocker: hook planner skipped the FAILED-with-retry plan_time

`autopilotPlansForScope` unconditionally set
`after = latest.PlanTime` when `latest.Found`, then enumerated cron
occurrences in the half-open interval `(after, dbNow]`. That
EXCLUDED the FAILED plan_time itself, so `tryClaim`'s
"FAILED-with-retry" branch — which only fires when the planner
returns the same plan_time — never ran. A claim + crash sequence
left the FAILED row stuck at attempt<max_attempts forever and the
scheduled occurrence was lost (MUL-3551 acceptance ③).

Fix: hook now branches on `latest.RetryEligible(now)` BEFORE
computing `after`. When the most recent stored row is FAILED with
attempts remaining and next_retry_at <= dbNow, the hook returns
`[latest.PlanTime]` unchanged. tryClaim's retry-from-FAILED path
fires, attempt increments, the run is retried, and the audit row
reaches SUCCESS at the same plan_time. Mirrors the cadence
planner's `info.RetryEligible(now)` branch in manager.plansForTick.

Tests

  * TestAutopilotScheduleJobCrashRecovery rewritten to actually
    pin the retry contract instead of just "no duplicate run":
      - assert first attempt completes at attempt=1 with a real
        task_id linkage (the "complete" snapshot the retry must
        reuse);
      - simulate a crash mid-dispatch (status=RUNNING, expired
        stale_after, ghost lease_token);
      - assert tick 2 transitions the SAME exec row (same plan_time)
        to status=SUCCESS at attempt=2 (proving the planner did
        NOT skip past the FAILED bucket);
      - assert autopilot_run stays at exactly one row, reused from
        the first attempt — proving DispatchAutopilotForPlan's
        complete-run reuse path is what closes the loop.

  * TestAutopilotScheduleJobPausedAutopilotSkipsAtHandler rewritten
    to invoke `job.Handler` directly (the previous version drove
    `mgr.RunOnce` which short-circuited at the scope-list SQL
    filter and never reached the handler). The new test pauses the
    autopilot AFTER setup, calls the handler with a fabricated
    HandlerInput, and asserts the handler returns
    skipped_reason=autopilot_inactive without creating an
    autopilot_run.

  * TestAutopilotScheduleJobBadCronFailsLoudly renamed to
    TestAutopilotScheduleJobBadCronStaysSilent and updated to
    match the real implementation: a parse error in the plan hook
    surfaces as a manager-level warning log, NOT a
    sys_cron_executions row (no plan_time was ever claimed). The
    test now asserts zero exec rows AND zero autopilot_run rows,
    documenting that bad cron is a permanent configuration error
    (caught at HTTP create/update time first), not a transient
    failure that belongs in the retry envelope.

Refs MUL-3551

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 12:09:07 +08:00
Multica Eve
eca6102365 refactor(autopilot): autopilot_run.planned_at + DispatchAutopilotForPlan (2/3 MUL-3551) (#4443)
* refactor(scheduler): add PlansForScope hook for non-cadence jobs

The current Manager.plansForTick assumes a uniform Cadence grid:
plan_times are derived via FloorPlan(db_now, Cadence). That works for
rollup_task_usage_hourly but not for the upcoming Autopilot schedule
dispatch job, where each trigger has its own cron expression and the
plan_times do not snap to a single global grid.

This change adds an optional JobSpec.PlansForScope hook. When set:

  * Manager loads the latest stored plan for (job, scope) and passes
    a new LatestPlanInfo to the hook (exported from the previously
    private latestPlanInfo). The hook returns the plan_times to attempt
    this tick.
  * Cadence, CatchUpMode and CatchUpWindow are bypassed; the hook is
    in full control of plan_time selection.
  * MaxPlansPerTick still acts as a safety cap on the hook's output.
  * All other timing fields (RunTimeout / StaleTimeout /
    HeartbeatInterval / MaxAttempts / RetryBackoff / AllowStaleReentry)
    and the lease/heartbeat/terminal-write SQL primitives are reused
    unchanged.

JobSpec.validate now allows Cadence=0 when PlansForScope is set, and
makes the every_plan MaxPlansPerTick > 0 invariant fire only on
Cadence-driven every_plan jobs. Existing rollup_task_usage_hourly
behaviour is unchanged — that JobSpec leaves PlansForScope nil.

Tests:
  * TestJobSpecValidatePlansForScopeRelaxesCadence — validate() rules.
  * TestManagerPlansForScopeHookDrivesPlans — end-to-end hook delegation
    through the manager (DB-backed), proving that hook-returned
    plan_times go through the same tryClaim path, MaxPlansPerTick
    truncates without erroring, and LatestPlanInfo is populated on the
    second tick.
  * TestManagerPlansForScopeHookEmptyIsNoOp — empty hook output is a
    valid no-op.

No behaviour change for callers that don't set PlansForScope.

Refs MUL-3551

Co-authored-by: multica-agent <github@multica.ai>

* refactor(autopilot): add planned_at + DispatchAutopilotForPlan for occurrence idempotency

PR 2 of 3 for the scheduled-Autopilot refactor on MUL-3551.

Adds dispatch-layer idempotency for scheduled triggers. This is the
second line of defence behind the primary uq_sys_cron_execution
guarantee in sys_cron_executions: if a runner crashes between
"create autopilot_run" and "write SUCCESS in sys_cron_executions",
the next stale-steal retry re-enters dispatch with the SAME
(trigger_id, planned_at). Without a row-level guard, that retry
would create a duplicate autopilot_run, issue, and task.

Changes:

  * Migration 124: ALTER TABLE autopilot_run ADD COLUMN planned_at
    TIMESTAMPTZ + partial unique index on (trigger_id, planned_at)
    WHERE both are NOT NULL. Manual / webhook / api dispatch leaves
    planned_at NULL so they keep the existing semantics unchanged.

  * autopilot.sql: CreateAutopilotRun now takes planned_at;
    GetAutopilotRunByTriggerAndPlanned is the fast-path lookup used
    by DispatchAutopilotForPlan to detect a prior attempt's row
    without burning an INSERT.

  * service.DispatchAutopilotForPlan: new entry point for scheduled
    triggers that already know the canonical UTC plan_time of the
    occurrence they are firing. Looks up an existing run for
    (trigger_id, planned_at) and reuses it on a stale-steal retry;
    otherwise dispatches normally with planned_at stamped on the
    new run.

  * service.DispatchAutopilot keeps its current signature for
    manual / webhook / api callers (planned_at stays NULL).

  * recordSkippedRun also threads planned_at so the skip path
    participates in the same partial-unique guarantee.

  * sqlc v1.31.1 regenerated autopilot.sql.go + models.go.
    Unrelated workspace.sql.go drift restored.

Tests (against local Postgres):

  * TestDispatchAutopilotForPlanIsIdempotent — first call creates a
    run; second call with same (trigger, planned_at) reuses it
    (autopilot_run row count stays at 1); third call with a different
    planned_at on the same trigger creates a second run (proves we
    are not collapsing legitimate occurrences).

  * TestDispatchAutopilotForPlanRejectsZeroArgs — invalid trigger_id
    and zero planned_at both fail loudly so callers cannot silently
    disable the idempotency guard.

  * Existing autopilot listener / squad / dispatch tests all still
    pass.

This PR has no scheduler / handler / UI behaviour change on its own:
the new entry point exists but is not yet wired into the schedule
goroutine. PR 3 will register the autopilot_schedule_dispatch
JobSpec that consumes it and remove the legacy
cmd/server/autopilot_scheduler.go path.

Refs MUL-3551

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): DispatchAutopilotForPlan recovers partial-state runs

Review fix for #4443 (MUL-3551).

Before this change, DispatchAutopilotForPlan returned ANY existing
autopilot_run for (trigger_id, planned_at), including the
half-written rows produced when a runner crashed between
"CreateAutopilotRun" and "create downstream issue/task". The
scheduler handler would then write SUCCESS in sys_cron_executions
even though no issue or agent task was ever created, silently
losing the scheduled occurrence.

Fix:

  * New isAutopilotRunComplete helper classifies an existing run:
      - terminal status (completed / failed / skipped) → reuse.
      - issue_created with valid issue_id → reuse (the issue
        listener owns task creation from here).
      - running with valid task_id → reuse (the task is queued).
      - anything else → partial; must NOT short-circuit.

  * New SQL RecoverPartialAutopilotRun marks a partial row FAILED
    with a recovery reason AND clears its planned_at. The cleared
    planned_at releases the partial-unique slot in
    uq_autopilot_run_trigger_planned, letting the fresh dispatch
    INSERT a new row at the same (trigger_id, planned_at) without
    conflict.

  * DispatchAutopilotForPlan now branches on the lookup:
    complete run → return; partial run → recover-then-fresh-
    dispatch; not-found → fresh dispatch. The fresh dispatch path
    still goes through dispatchAutopilot, so the new row carries
    the real issue_id / task_id by the time the handler returns.

  * Tests: TestDispatchAutopilotForPlanRecoversPartialRun seeds a
    partial run (status='running', task_id=NULL for run_only;
    status='issue_created', issue_id=NULL for create_issue) and
    asserts the retry:
      - returns a DIFFERENT run row (no false reuse);
      - leaves the partial row in status='failed', planned_at=NULL,
        with a non-empty failure_reason for ops;
      - produces a fresh row with planned_at preserved AND the
        appropriate downstream linkage (task_id for run_only,
        issue_id for create_issue);
      - exactly one live row at (trigger_id, planned_at) after
        recovery, so the partial-unique constraint is honoured.

Existing TestDispatchAutopilotForPlanIsIdempotent and
TestDispatchAutopilotForPlanRejectsZeroArgs still pass — the
complete-reuse path is unchanged for the realistic SUCCESS-state
case.

Refs MUL-3551

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 12:01:10 +08:00
Multica Eve
f9ed62f075 refactor(scheduler): add PlansForScope hook for non-cadence jobs (#4442)
The current Manager.plansForTick assumes a uniform Cadence grid:
plan_times are derived via FloorPlan(db_now, Cadence). That works for
rollup_task_usage_hourly but not for the upcoming Autopilot schedule
dispatch job, where each trigger has its own cron expression and the
plan_times do not snap to a single global grid.

This change adds an optional JobSpec.PlansForScope hook. When set:

  * Manager loads the latest stored plan for (job, scope) and passes
    a new LatestPlanInfo to the hook (exported from the previously
    private latestPlanInfo). The hook returns the plan_times to attempt
    this tick.
  * Cadence, CatchUpMode and CatchUpWindow are bypassed; the hook is
    in full control of plan_time selection.
  * MaxPlansPerTick still acts as a safety cap on the hook's output.
  * All other timing fields (RunTimeout / StaleTimeout /
    HeartbeatInterval / MaxAttempts / RetryBackoff / AllowStaleReentry)
    and the lease/heartbeat/terminal-write SQL primitives are reused
    unchanged.

JobSpec.validate now allows Cadence=0 when PlansForScope is set, and
makes the every_plan MaxPlansPerTick > 0 invariant fire only on
Cadence-driven every_plan jobs. Existing rollup_task_usage_hourly
behaviour is unchanged — that JobSpec leaves PlansForScope nil.

Tests:
  * TestJobSpecValidatePlansForScopeRelaxesCadence — validate() rules.
  * TestManagerPlansForScopeHookDrivesPlans — end-to-end hook delegation
    through the manager (DB-backed), proving that hook-returned
    plan_times go through the same tryClaim path, MaxPlansPerTick
    truncates without erroring, and LatestPlanInfo is populated on the
    second tick.
  * TestManagerPlansForScopeHookEmptyIsNoOp — empty hook output is a
    valid no-op.

No behaviour change for callers that don't set PlansForScope.

Refs MUL-3551

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 12:00:34 +08:00
Naiyuan Qing
8a0934c741 fix(editor): track @mention selection by identity, not slot index (MUL-3607) (#4488)
The @mention popup tracked the highlighted row with a positional integer
(selectedIndex into displayItems), but rows are rendered in the re-bucketed
order produced by groupItems() (current → recent → search → users → issues).
An async server "search" result is appended to the END of displayItems yet
hoisted near the TOP on render, so the highlighted row and the committed
item pointed at different entries — you navigate to one target but mention
its neighbour. A separate useEffect also force-reset selectedIndex to 0 on
every displayItems change, snapping an active selection back to the first row
whenever async results landed.

Root cause: a slot index is not a stable target for a list whose order and
length change asynchronously. Track selection by item identity instead:

- Replace selectedIndex state with selectedKey (the item's `type:id`).
- Derive groups/orderedItems (the exact rendered order) and resolve the
  numeric index from selectedKey against orderedItems; fall back to row 0
  when the pinned item is gone or nothing is picked yet.
- Keyboard nav, Enter, clicks, highlight, and scroll all index orderedItems,
  so the highlighted row always equals the committed item.
- Drop the force-reset effect; identity-based selection self-heals across
  reorders and async arrival without resetting an active pick.

Adds a regression test asserting the highlighted row equals the committed
item when groupItems reorders the list, both initially and after ArrowDown.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 10:53:51 +08:00
Naiyuan Qing
d43840f322 Revert "fix(editor): index @mention selection by rendered order (MUL-3607) (#…" (#4489)
This reverts commit 1890a9fa19.
2026-06-24 10:50:53 +08:00
Naiyuan Qing
1890a9fa19 fix(editor): index @mention selection by rendered order (MUL-3607) (#4487)
The @mention popup navigated and committed by indexing the flat
`displayItems` array, but rendered rows in the re-bucketed order from
`groupItems()` (current → recent → search → users → issues). In chat
(context mode) the async server-search results are appended to the end
of `displayItems` yet tagged `group:"search"`, so `groupItems()` hoists
them near the top. The highlighted row and the committed item then point
at different entries — you select one target but mention its neighbour
(the reported "@bohan picks the next one" off-by-one).

Make the flattened grouped order (`orderedItems`) the single index space
for `selectedIndex`, arrow keys, Enter, and clicks, so the highlighted
row is always the committed item. Plain issue-comment mentions (default
mode) were already safe — no group tags means `groupItems()` is
order-preserving — and stay unchanged.

Adds a regression test asserting highlighted row == committed item when
the list is reordered by a hoisted search result.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 10:46:03 +08:00
Naiyuan Qing
b79777caec feat(comments): resolve-aware fold for agent comment reads (MUL-3555) (#4463)
* feat(comments): resolve-aware fold for agent comment reads (MUL-3555)

Agents reading a long issue paid tokens for settled discussion. The human
timeline already folds resolved threads, but the agent read path
(`comment list`) ignored resolved_at entirely — humans saw the conclusion,
agents got the full raw discussion.

Add an opt-in `fold=true` projection to ListComments that collapses each
resolved thread to root + conclusion (reply-resolved) or root only
(root-resolved), reusing the human timeline's deriveThreadResolution
semantics. The resolved thread's root carries `thread_resolved` +
`folded_count`; `--full` brings the dropped comments back. Fold is rejected
on partial-thread reads (since/tail) and roots_only, where a resolution
comment could be unfetched and silently dropped.

CLI `comment list` folds by default on the complete-thread reads (default,
--recent, untailed --thread) with a `--full` escape hatch; the agent
prompts and runtime brief document the fold + escape. No new endpoint, no
human UI change, no SQL/migration change — in-memory projection, same
precedent as summary/roots_only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* refactor(daemon): dedupe fold prompt restatements per review (MUL-3555)

Howard's PR review flagged DRY redundancy: the resolve-fold rule was
restated in full in the task prompt (prompt.go:41/:182) and the brief
workflow steps (runtime_config.go:673/:692, reply_instructions cold
hint) even though the canonical command catalog (runtime_config.go:477)
— always present in the brief — already documents it in full, and the
task prompt explicitly defers to it ("follow the rule in your runtime
workflow file").

Keep the catalog entry full (the canonical reference); shrink the five
inline restatements to a short "resolved threads come back folded —
`--full` to expand" pointer. No loss of signal (the agent always has the
full catalog in context), ~80-120 tokens/run saved on the worst-case
assignment / cold paths.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 09:52:18 +08:00
Naiyuan Qing
a898e5317e fix(chat): stop cancelled-run drafts resurrecting after deletion (#4486)
Cancelling a chat task restored the message into the input via a
separate `editorRestore` component state that also held a private copy
of the content and forced an editor remount (bumping `editorKey`).

That copy was never cleared and `activeRestore` re-derived on every
return to the draft's session, so `defaultValue={activeRestore?.content
?? inputDraft}` kept re-injecting text the user had already deleted —
the draft (`inputDraft`) cleared, but the stale copy did not.

The editor already self-syncs external `defaultValue` changes into a
live instance (content-editor.tsx defaultValue-sync effect, used for WS
description updates), so the remount mechanism was redundant. Drop the
whole `editorRestore` state and let `inputDraft` be the single source of
truth: restore just writes the draft, the editor's own sync displays it.
Now cancel-restore behaves exactly like normal typing — delete the text
and it stays gone across navigation.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 09:33:56 +08:00
Bohan Jiang
ac84b8c70c fix(agent): stop Antigravity turns dying at agy's hidden 5m print-timeout (#4462)
agy's --print-timeout defaults to 5m when the flag is omitted, but the
daemon treated "omit the flag" as "no cap". In the default no-cap config
every Antigravity turn was therefore silently capped at 5 minutes: any
run whose build/tests outlived the budget had agy abort mid-turn, print
"Error: timed out waiting for response", and exit 0 — which the backend
recorded as a successful "completed" with truncated output (the reported
"Antigravity disconnects", MUL-3570 / #4453).

- Always pass --print-timeout: the configured cap when positive, else a
  large value (24h) that defers to the daemon's idle/tool watchdogs.
- Detect agy's print-mode timeout marker in the run log and surface the
  result as a timeout instead of a truncated success.

Verified by reproducing against agy 1.0.8 and with new unit + end-to-end
backend tests.

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
v0.3.28
2026-06-23 18:57:50 +08:00
Multica Eve
be5fd7d3f0 MUL-3577: add 2026-06-23 changelog entry (#4461)
* docs: add 2026-06-23 changelog entry

Co-authored-by: multica-agent <github@multica.ai>

* docs(changelog): sharpen 0.3.28 title and handoff wording

- Headline the two flagship features in the title: staged Sub-Issues and Qoder runtime support
- Rewrite the vague agent-handoff line to spell out the pre-trigger confirmation (whether/which agent will start, apply without running) and the handoff note as next-run context
- Apply across en/ja/ko/zh

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: J <j@multica.ai>
2026-06-23 18:50:44 +08:00
Bohan Jiang
cfc488769b MUL-3574: update runtime and CLI docs (#4460)
* docs: update runtime and CLI docs for MUL-3574

Co-authored-by: multica-agent <github@multica.ai>

* docs: address runtime docs review for MUL-3574

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 17:26:12 +08:00
Bohan Jiang
294953ba37 fix: delete custom runtime profiles from runtime rows (#4456)
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 16:52:46 +08:00
Naiyuan Qing
2857a4c649 fix(transcript): live-update issue/agent transcript dialog from shared cache (#4452)
The transcript dialog opened from a running task's row showed only a
one-shot snapshot taken at open time: TranscriptButton fetched once via
api.listTaskMessages and cached it locally, never subscribing to the
shared ["task-messages", taskId] cache that the WS task:message stream
already seeds. New tool calls / thinking / text never appeared until the
task finished or the page reloaded.

Add a live-cache mode to the shared TranscriptButton: when isLive and the
parent provides no items and the task id is a persisted UUID, render from
the shared task-messages cache so the open dialog grows in real time. On
open (and again on the running→terminal transition) force a backfill via
api.listTaskMessages and merge it into the cache by seq — taskMessagesOptions
is staleTime:Infinity, so a plain subscription never heals a WS reconnect
gap. The cache observer is read-only (enabled:false) so React Query never
blind-replaces the cache; only the WS handler and the seq-merged backfill
write it. The subscription mounts only while the dialog is open, so closed
live rows add no baseline requests; terminal tasks keep the lazy one-shot
fetch.

Covers issue execution-log and agent activity. Autopilot issue-less
run_only live log is out of scope: the backend doesn't broadcast
task:message for tasks with no issue/chat session, so there's nothing to
subscribe to — backend broadcast unchanged.

Extract mergeTaskMessagesBySeq into packages/core/chat/queries.ts and route
both the realtime task:message handler and the new backfill through it, so
there is one seq-merge semantics for that cache instead of two.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 16:35:29 +08:00
Bohan Jiang
5038c983c0 MUL-3281: Add daemon skill bundle refs (#4445)
* feat: add daemon skill bundle refs

Co-authored-by: multica-agent <github@multica.ai>

* fix: tighten skill bundle resolve safeguards

Co-authored-by: multica-agent <github@multica.ai>

* feat: add task prepare lease

Co-authored-by: multica-agent <github@multica.ai>

* fix: isolate prepare lease concurrent index migration

Co-authored-by: multica-agent <github@multica.ai>

* fix: keep prepare lease active through start

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 16:19:16 +08:00
Naiyuan Qing
3ce97453b3 fix(issues): pre-trigger preview + run-confirm + handoff UX polish (MUL-3375) (#4454)
* fix(issues): stop issue-trigger preview flicker

The pre-trigger preview re-rendered/refetched on every workspace task
event: WS task lifecycle invalidated issueTriggerPreviewAll (staleTime 0),
forcing a background refetch whose isFetching was surfaced as isLoading,
collapsing and reopening CreateRunHint's reveal band.

The assign source (create / assignee change) cancels existing tasks before
enqueuing, so its verdict can't shift from a task event at all; the status
source's pending dedup could, but the preview is advisory and the write
path re-evaluates authoritatively, so a rare stale label is harmless. Drop
the WS invalidation so the preview refetches only on input (signature)
change. Keep the comment-trigger invalidation — its verdict genuinely
changes mid-compose and its chips drive an immediate, unconfirmed send.

Align the hook's data handling with the comment-trigger preview:
keepPreviousData so an input switch swaps in place instead of collapsing,
and treat only the first load (no prior data) as loading.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(issues): skip run-confirm modal for backlog assign

Assigning a Backlog issue to an agent/squad never starts a run (the
parking lot — server/internal/service/issue_trigger.go), so the
pre-trigger confirm modal only rendered an empty "won't start" box with
a single Apply button. Apply directly instead: the single path checks
issue.status, the batch path skips only when every selected issue is
Backlog (mixed selections still confirm — the non-backlog ones trigger).
Mirrors the existing backlog short-circuit in handleBatchStatus.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(modals): run-confirm loading state + submit spinner

The dialog grew in height after open: it rendered the short "won't
start" variant while POST /api/issues/preview-trigger was in flight, then
the note box appeared when the predicate landed. Keep the note box
mounted (disabled) during loading so assign mode opens at its resolved
height, and show a Spinner + 'checking' headline while loading.

Submit had no feedback — buttons only disabled, which read as frozen for
note assigns (the request starts an agent server-side). Track which
footer action is in flight and show a Spinner on the clicked button.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(issues): show handoff note in execution-log trigger text

An assignment-triggered run that carried a handoff note showed the
generic "Initial run" label. Surface the note inline (truncated, like
comment triggers show their text) so the row reads as the handoff.

taskToResponse now populates handoff_note for all callers (dropping the
now-redundant explicit set in ClaimTaskByRuntime); the field is added to
the AgentTask type + zod schema (optional, additive — old clients ignore
it via the loose schema, new clients fall back to "Initial run").

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 16:15:44 +08:00
Bohan Jiang
09f90abb70 MUL-3568 feat(landing): show live GitHub star count on the header GitHub button (#4451)
* feat(landing): show live GitHub star count on the header GitHub button

Add a small client hook (useGithubStars) that fetches stargazers_count
from the GitHub API and a formatStarCount helper that renders it in
GitHub's compact repo-header style (e.g. "37.6k"). The landing header's
GitHub button now appends a star badge (faint divider + filled star +
count) on both the desktop and mobile menu entries.

Fetched client-side on purpose: LandingHeader is shared across every
marketing page, so one client fetch covers them all without threading a
server value through each render site, and each visitor calls the API
from their own IP, sidestepping the shared-outbound-IP rate limit the
server-side github-release fetcher works around with a PAT. The result
is memoized at module scope (plus in-flight dedupe); a failed fetch
caches null and the button degrades to the plain "GitHub" label.

* fix(landing): drop the star glyph from the GitHub star badge

In the GitHub button context the number already reads as the star count,
so the icon is redundant. Keep the divider + count only.
2026-06-23 15:51:14 +08:00
Bohan Jiang
a5636f0ff4 feat: add copy button to readonly code blocks (#4448)
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 15:01:34 +08:00
Multica Eve
12ea1f6a8c MUL-3495: support custom runtime args and registration errors (#4408)
* feat: support custom runtime args

Co-authored-by: multica-agent <github@multica.ai>

* fix: address custom runtime review nits

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 14:20:18 +08:00
Reinhard Schneidewind
7008f4276d MUL-3558 feat: add 'issue usage' CLI command for aggregated issue token usage
`GET /api/issues/:id/usage` endpoint (handler `GetIssueUsage`). Returns the
aggregated token usage for an issue, summed across all of its task runs.

The usage is already captured server-side and shown in the issue detail view,
but was not reachable from the CLI, so it couldn't feed billing/cost scripts.
This closes that gap with no backend changes.

$ multica issue usage MUL-139
INPUT_TOKENS  OUTPUT_TOKENS  CACHE_READ  CACHE_WRITE  RUNS
5625          83880          9190806     154078       1

$ multica issue usage MUL-139 --output json
{ "total_input_tokens": 5625, ... "task_count": 1 }

Accepts an issue key (`MUL-139`) or UUID, mirroring `issue runs`.

- Table cells use the existing `formatMetadataValue` helper so large cache-token
  counts render as plain integers (not scientific notation).
- Scope is the per-issue aggregate the endpoint already returns (`task_count` =
  run count); a per-run token breakdown is out of scope.

- `go test ./cmd/multica/ -run TestRunIssueUsage` (added) 
- `go vet ./cmd/multica/` 
- Verified against a live self-hosted server; numbers match the issue UI.

- `server/cmd/multica/cmd_issue.go` — command + handler
- `server/cmd/multica/cmd_issue_test.go` — unit test
- `CLI_AND_DAEMON.md` — docs
2026-06-23 14:13:57 +08:00
Naiyuan Qing
4ab335b8a5 MUL-3416: Issue pre-trigger preview + Handoff Note (#4383)
* feat(issues): unify run-enqueue decision behind WillEnqueueRun + preview endpoint

Collapse the issue update/batch enqueue copies into one service predicate
service.IssueService.WillEnqueueRun, shared verbatim with a new dry-run
endpoint POST /api/issues/preview-trigger so the four entry points stop
drifting (squad/self-loop/batch omissions, MUL-3375). The private-agent gate
stays at the HTTP boundary: write paths inject allow-all, preview injects the
real gate so it never leaks a private agent's readiness.

Add suppress_run to issue update/batch: the change applies but no run starts.
Remove the now-dead handler mirrors shouldEnqueueSquadLeaderOnAssign /
isSquadLeaderReady. service.Create and the comment trigger chain are untouched.

Tests: preview behavior, preview<->write-path match, batch aggregation,
member no-trigger, suppress_run skip, malformed-body 400.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): inject handoff note into assigned runs via first-class task field

Add an optional handoff_note carried by issue assign/promote into the run's
opening prompt and issue_context.md, via a dedicated agent_task_queue column
(migration 122) and a daemon assignment-handoff render branch — never a
fabricated comment, never trigger_comment_id (MUL-3375 §6.1).

Thread the note through enqueueIssueTask/enqueueMentionTask + WithHandoff
public variants and dispatchIssueRun; suppress_run or a parked write drops it
(no run = nothing to inject). Soft version gate: MinHandoffCLIVersion +
HandoffSupported, surfaced per-trigger as handoff_supported in the preview so
the UI can gray the note box on old daemons; the assignment never hard-fails.

Tests: daemon prompt + issue_context render via the assignment branch (not
quick-create/comment), version helper matrix, note persists on the task,
suppressed assign enqueues nothing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): leave a display-only handoff record on the timeline

When an assign/promote with a handoff note starts a run, write one
type='handoff' timeline record via TaskService.RecordHandoff — a direct
Queries.CreateComment + timeline event that bypasses Handler.CreateComment, so
it never reaches triggerTasksForComment and cannot start a second run
(MUL-3375 §6.2, the must-not-retrigger invariant). Author is the actor who
handed off; body is the note. Migration 123 admits the 'handoff' comment type.
Recorded only on a real run start: suppress_run or a parked write writes
nothing. enqueueSquadLeaderTask now reports whether it enqueued so the trace
is gated on an actual dispatch.

Test: exactly one handoff record on assign-with-note, exactly one task (no
re-trigger), and no record when suppressed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): frontend plumbing for issue-trigger preview + handoff (core)

Add api.previewIssueTrigger + IssueTriggerPreviewSchema (zod parseWithFallback),
the use-issue-trigger-preview hook, issueKeys.issueTriggerPreview(+All) with WS
queue-state invalidation, suppress_run/handoff_note on UpdateIssueRequest, the
'handoff' CommentType, and stripping of the control fields from optimistic
update/batch cache patches (MUL-3375 §9).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): exclude handoff records from new-comment counting

type='handoff' is a display-only timeline record, not conversation. Exclude it
from CountNewCommentsSince so a handoff note never inflates the count of
"new comments to catch up on" fed to a claiming agent (MUL-3375 §12). Analytics
already excludes it (RecordHandoff is a direct write that emits no analytics
event), and the comment-trigger path is already bypassed.

Test: a handoff record does not bump the new-comment count; a real comment does.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): pre-trigger preview UI, handoff note, timeline card (web/desktop)

Wire the §9 frontend onto the preview endpoint + handoff fields:
- Delete the backlog blocking dialog (backlog-agent-hint*) and its modal type;
  the over-eager nag is gone. Backlog awareness is now a passive label.
- RunConfirmModal: single assign + batch assign/status route here. Shows the
  backend predicate's verdict ("将启动 @X" / "将启动 N 个" / parked), an optional
  handoff note (assign only, soft-gated by handoff_supported), and 暂不启动 —
  then applies via update/batch. No frontend guessing.
- create modal: passive CreateRunHint ("将启动 @X" / backlog parked).
- single status change stays a direct apply (unchanged).
- timeline: render type='handoff' as a distinct, non-interactive handoff card.
- i18n run_confirm + handoff_card across en/ja/ko/zh-Hans; drop backlog action
  keys; locale parity green.

Tests: use-issue-actions (assign → run-confirm modal, member → direct),
create-issue + comment-card suites updated/green; views typecheck + lint clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(issues): use a valid anchor in the handoff count-exclusion test

CountNewCommentsSince filters id <> @anchor_id; SQL id <> NULL is NULL and
excludes every row, so an empty anchor made the control assertion read 0. The
production caller always passes a real anchor — mirror that with a non-matching
sentinel uuid.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(issues): RunConfirmModal apply logic (start/suppress/note-gate/batch)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(core): preview schema malformed/missing/null fallback coverage

Cover IssueTriggerPreviewSchema via parseWithFallback (MUL-3375): well-formed
parse, top-level + item default fills (empty/older backend), and fallback to
{ triggers: [], total_count: 0 } for malformed shapes, a dropped required
issue_id, a wrong-typed total_count, and null/non-object bodies — so the four
entry points degrade to "nothing will start" instead of throwing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* refactor(issues): remove display-only handoff timeline record (留痕)

The handoff "留痕" timeline record (type='handoff' comment written on run
start) was judged superfluous and dropped per product call. This removes
only the display-only trace; the handoff NOTE injection into the run's
opening prompt + issue_context.md is untouched.

- backend: drop RecordHandoff + its call in dispatchIssueRun
- db: drop the `type <> 'handoff'` exclusion in CountNewCommentsSince and
  migration 123 (comment_type_check reverts to the 4-type set from 001);
  no production data exists for this unreleased feature
- frontend: drop the "handoff" CommentType, HandoffCard, and handoff_card
  i18n (all locales)
- tests: drop handoff_count_test.go and the record-write assertions in
  issue_trigger_preview_test.go (note-injection tests retained)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): dismissable run-confirm modal + team-handoff copy

Two fixes to the pre-trigger confirm modal (MUL-3375).

1. Dismissable: switch RunConfirmModal from AlertDialog to the standard
   shadcn Dialog so it has the close (X) button + Esc + click-outside.
   Previously the only choices were "start" / "don't start now" with no
   way to abort the action entirely; dismissing now cancels with no write.

2. Copy: rework the action-surface wording away from the backend term
   "run" toward team-handoff voice — 指派 / 开始 / 交接 (run stays only on
   record surfaces). Unifies the note's three names to "交接说明", and
   parallels the rewrite across en/ja/ko.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* chore(agent): bump handoff note min CLI version to 0.3.28

The daemon release that renders handoff notes ships in 0.3.28 (0.3.27
was the prior tag), so move the soft-gate threshold up. Below this the
note is silently dropped and the frontend grays the note box — assignment
is never blocked.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(issues): skip run-confirm when batch-moving issues to backlog

A move into backlog never starts a run (service/issue_trigger.go), so the
pre-trigger confirm modal degenerated to an empty "won't start" box with a
single Apply button — pure friction. Apply directly instead, matching the
single-issue status path. Other target statuses still route through the
modal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(issues): refine pre-trigger preview hint and copy

- Move the create-issue run hint to a reveal band (grid 0fr→1fr) above the
  property toolbar. It was sharing the footer button row and, lacking a
  width constraint, reflowed the submit buttons whenever it appeared.
  Restyle to a borderless, comment-style avatar+caption that is purely a
  caption (non-interactive avatar).
- Distinguish squad from agent in the pre-trigger copy: a squad's leader
  evaluates and delegates rather than "starting work" itself. Add
  will_start_named_squad / will_start_squad / create_will_start_squad across
  en/zh/ja/ko (reusing the squad_leader_* evaluate→arrange vocabulary) and
  branch run-confirm + the create hint on squad assignees.
- Bold the assignee name in the run-confirm headline via a language-safe
  sentinel split (no per-language prefix/suffix keys).
- Align zh "开始处理" → "开始工作" on the single-assign copy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(issues): stub ActorAvatar in create-issue suite

CreateRunHint now renders an ActorAvatar for agent/squad assignees, which
pulls in getActorInitials/getActorAvatarUrl + the workspace/presence/navigation
hook tree. This form-focused suite only stubbed getActorName, so the
squad-forwarding test crashed with "getActorInitials is not a function". Stub
the avatar inert — its own behavior is covered elsewhere.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Walt <walt@multica.ai>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 13:17:13 +08:00
honki12345
4679217586 feat(cli): STR-208 오토파일럿 구독자 플래그 추가 (#4438)
* feat(cli): STR-208 오토파일럿 구독자 플래그 추가

* test(core): Issue fixture stage 기본값 추가

* test(views): Issue fixture stage 기본값 추가
2026-06-23 11:56:09 +08:00
Naiyuan Qing
45dae3185f fix(issues): eliminate optimistic-update drag flicker (board, list, batch, WS) (#4415)
* fix(issues): stop kanban card snapping back on drag

A cross-column drag on a non-position-sorted board left the card in its
origin column for the whole request, then jumped to the target only when
the mutation settled — the "snaps back, then moves" glitch. Root cause was
three coupled choices in the optimistic path:

- board-view never updated local columns on drop for sortBy != "position"
  (onDragOver is a no-op there), so the card relied on the settle refetch
  to move across.
- useUpdateIssue invalidated the whole list on settle, replacing the column
  and re-landing the card even on success.
- patchIssueInBuckets appended a moved card to the column tail instead of
  its position slot, so any later cache refresh teleported it to the end.

Fixes:
- board-view: optimistically move the card into the target column on drop
  for the non-position path (insertIdByPosition), and reconcile local
  columns from the cache on settle for both paths (revert on error now that
  the list is no longer refetched).
- mutations: reconcile via onSuccess surgical patch of the returned entity;
  drop the list/detail invalidation from onSettled (aggregates still flush).
- cache-helpers: patchIssueInBuckets inserts the moved/reordered card at its
  position slot; a plain field update still keeps its slot.

Adds cache-helpers and drag-utils unit tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): patch My-Issues / Project board caches on move too

The drag fix made the board reconcile local columns from its feeding cache
on settle. The workspace board rides issueKeys.list (patched by onMutate),
but the My-Issues and Project boards ride the myList cache, which the
mutation did not patch — so a successful move snapped back on those boards.

useUpdateIssue now patches/snapshots/rolls back every bucketed list cache
(workspace list + myList), selected by the ListIssuesCache `byStatus` shape
so grouped (assignee) and flat (gantt) caches are skipped. Adds renderHook
regression tests covering both-cache optimistic move, both-cache rollback,
and no-list-invalidation-on-settle.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): drop redundant WS position->list invalidate

onIssueUpdated already surgically patches the non-filtered workspace board
via patchIssueInBuckets (cross-status move + same-column reorder). The extra
`if (position) invalidateQueries(list)` re-pulled the whole board on top of
that, re-introducing drag flicker through the echoed-back WS event. Removed.
Filtered myAll lists still invalidate (membership can change there) — the
client-side membership reconciliation for those is a separate follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): batch update patches myList + stops list refetch on settle

- onMutate now patches both issueKeys.list and the filtered issueKeys.myAll
  bucketed caches, so a batch edit on a My-Issues / Project board is
  optimistic too. Previously only the workspace board was patched, so batch
  edits on those boards relied entirely on the settle refetch.
- onSettled no longer invalidates issueKeys.list: the optimistic patch is a
  complete reconcile for these bucketed boards (batch changes status /
  priority / project, never a server-computed value), so a full-board
  refetch only re-introduced the flicker the single-issue path removed.
  Aggregate / grouped caches still refresh.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): list view optimistically moves row on non-position drag

The sortBy != "position" branch called onMoveIssue without moving the row in
local columns, so the row sat in its origin group for the whole request and
only jumped across on settle -- the same snap-back the board view had before
its fix. Now mirrors board-view: setColumns(insertIdByPosition) on drop so
the settle rebuild is a visual no-op.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): keep My-Issues/Project boards in place on non-membership WS change

onIssueUpdated now surgically patches the filtered myList (myAll) caches and
only invalidates them when the change can actually move an issue in/out of the
filter: an assignee change (covers My-Issues direct-assignee + the involves leg
+ actor panels) or a project change (Project board). A pure status / position /
priority / label change reconciles in place -- no refetch -- removing the last
drag flicker on filtered boards.

Uses the assignee_changed flag the server already sends on issue:updated
(surfaced on IssueUpdatedPayload + forwarded by the realtime dispatch); project
change is diffed client-side against the cached value. No predicate replication,
no backend change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): add settle-lock to swimlane drag (no clobber mid-flight)

The swimlane drag had no settle window: the resync useEffect (and the issueMap
freeze) guarded only isDraggingRef, so a cache change landing after drop but
before the move settled could rebuild localCells out from under the optimistic
move. Adds isSettlingRef + settleVersion (mirroring board-view / list-view): the
lock is held from drop until onMoveIssue settles, then released, forcing a
single resync from the reconciled cache.

onMoveIssue now accepts the same optional onSettled callback board/list already
use; the parent handleMoveIssue supplies it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* refactor(issues): extract shared useDragSettle hook for board + list

board-view and list-view carried byte-identical drag/settle scaffolding (the
local columns mirror, the dragging/settling locks, the post-move animation-frame
throttle, and the settle callback). That duplication is exactly what let
list-view silently drift earlier (it had lost the optimistic-move half of the
fix, and its position-branch settle callback omitted the settleVersion bump).
Extract the primitive into useDragSettle so both surfaces share one
implementation and can't drift again.

Behavior-preserving for board-view. For list-view the one intended alignment:
its position-branch failed move now reverts, gaining the settleVersion bump
board-view already had.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 09:20:01 +08:00
Bohan Jiang
48b8dbf439 feat(daemon): surface sub-issue stages in the always-on runtime brief (#4426)
Agents creating sub-issues only saw the runtime brief's Sub-issue
Creation section, which taught the manual todo/backlog serial chain and
never mentioned stages — the `--stage` flow was documented only in the
multica-working-on-issues skill, which an agent reads only if it opens
it. So agents defaulted to hand-managed backlog chains and rarely reached
for stages.

- Add an "Ordering with stages" paragraph to the brief's Sub-issue
  Creation section nudging agents to group ordered/waiting sub-issues
  with --stage instead of hand-promoting a backlog chain.
- List --stage on the brief's issue create / update command lines and
  add multica issue children to the Core command list for discoverability.
- Extend the brief test with the new stage assertions.

The Sub-issue Creation section stays gated to issue-bound runs (skipped
for chat/quick-create/autopilot), unconditional on parent_issue_id, and
free of parent-notification guidance — all existing canaries still pass.

MUL-3508

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 01:08:10 +08:00
Bohan Jiang
3a814bd74a fix(issues): refine Stage field icon and dropdown font (#4425)
Replace the # (Hash) icon for the Stage property with the Milestone icon
across the picker trigger, dropdown option rows, and the Add-property menu.

Shrink the Stage dropdown option font to text-xs (scoped to the Stage
picker only; the shared PickerItem keeps text-sm so other property
dropdowns are unaffected).
2026-06-23 00:41:36 +08:00
Bohan Jiang
8122ee3bcd MUL-3528: clarify repo-scoped skills docs (#4424)
* docs: clarify repo-scoped skills

Co-authored-by: multica-agent <github@multica.ai>

* docs: mark repo-scoped skills as expected

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 00:33:41 +08:00
Bohan Jiang
a123dfc2df MUL-3508: stage sub-issues so the parent wakes per stage, not per child (#4410)
* feat(issues): stage sub-issues so the parent wakes per stage, not per child

Sub-issues under a parent can be grouped into ordered stages (issue.stage).
The child-done -> parent notification + assignee wake now fire only when a
stage barrier closes: every sub-issue in the lowest unfinished stage has
reached a terminal status (done/cancelled). An unstaged sibling set is one
implicit stage, so the parent is woken once when the last sub-issue finishes
instead of on every child — the default fix for the fire-on-every-child
cascade reported in discussion #4320 / MUL-3508.

Stage advancement stays agent-driven: the server only detects the closed
barrier and wakes the parent assignee, who decides whether to promote the
next stage.

- DB: nullable issue.stage (CHECK >= 1) + sqlc regen
- API: stage on issue create/update/response and batch update
- CLI: `issue create`/`issue update` --stage; new `issue children` command
  that lists sub-issues grouped by stage (table + json)
- stageBarrierClosed / stageProgressSummary in issue_child_done.go, with the
  wake comment now stage-aware, plus unit tests
- skill docs (multica-working-on-issues SKILL.md + source map)

Web UI (create-form stage picker, sidebar edit, group-by-stage display) is a
follow-up; the API already returns stage for it to consume.

MUL-3508

Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): address review on stage barrier (cancel, batch, unstaged)

Resolves the three blockers from the PR review:

1. Cancel can close a stage. The child-done barrier now fires on any
   non-terminal -> terminal transition (done OR cancelled), not just done.
   isTerminalChildStatus already treats cancelled as terminal (a cancelled
   sibling never finishes, so it must not hold a stage open), so a cancelled
   last-open child now closes its stage and wakes the parent. Keying on the
   transition also makes a later cancelled -> done edit a no-op, avoiding a
   lagging duplicate wake.

2. Batch update of stage no longer no-ops. `hasMutation` now includes
   "stage", so `{"updates":{"stage":N}}` persists instead of returning
   {"updated": 0}.

3. Unstaged children no longer participate in the staged frontier. In a
   staged sibling set, NULL-stage children neither hold a stage open nor fire
   on their own completion, and the wake comment no longer renders "Stage 0".
   This matches migration 123 ("NULL does not participate in staged
   grouping") and the CLI's separate unstaged group, removing the footgun
   where an unstaged backlog child silently blocked Stage 1.

Tests: cancellation closes a stage (staged + unstaged), unstaged ignored in a
staged set, stage summary skips unstaged, and a stage-only batch update
persists.

MUL-3508

Co-authored-by: multica-agent <github@multica.ai>

* feat(web): stage UI — create picker, sidebar edit, group sub-issues by stage

Frontend for the sub-issue stage feature (web + desktop, shared via packages):

- core: `stage` on the Issue type + create/update request types; zod
  IssueSchema parses it (defaults to null for older backends) with schema
  tests for the numeric and omitted cases.
- StagePicker component (mirrors the other property pickers): "No stage" +
  Stage 1..N, offering one beyond the current/sibling max.
- Create-issue modal: a Stage pill, shown only when a parent is selected,
  threaded into the create payload.
- Issue detail sidebar: an editable Stage row + "add property" entry, gated to
  sub-issues (issues with a parent).
- Sub-issue list grouped by stage with per-stage headers (flat when unstaged).
- i18n: stage keys across en / zh-Hans / ja / ko (parity test passes).

Verified: full typecheck (6/6), core (591) + views (1433) vitest suites,
lint clean (no new findings). Backend/CLI shipped earlier in this PR.

MUL-3508

Co-authored-by: multica-agent <github@multica.ai>

* test(issues): add stage to Issue fixtures merged from main

The merge brought in new Issue fixtures that predate the required `stage`
field: core issues/batch.test.ts, views batch-action-toolbar.test.tsx, and
the mobile EMPTY_ISSUE_FALLBACK sentinel. Add `stage: null` so they satisfy
the Issue type (mobile reuses core's IssueSchema for parsing, so only the
sentinel needs it).

MUL-3508

Co-authored-by: multica-agent <github@multica.ai>

* fix(web): feed StagePicker the sibling max stage so higher stages stay selectable

The StagePicker accepts maxStage to extend its option list beyond the
floored Stage 1-3, but neither call site passed it, so a parent with an
existing Stage 4/5 child could not pick that stage when creating a new
sub-issue or editing one in the sidebar.

- Compute the sibling max stage at both call sites: the create modal now
  loads the parent's children (childIssuesOptions) and the detail sidebar
  reuses the already-loaded parentChildIssues.
- Extract maxSiblingStage + stageOptions as pure helpers on stage-picker
  and unit-test them (the regression: a Stage 5 sibling keeps Stage 5
  selectable and offers Stage 6).

MUL-3508

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 00:14:42 +08:00
Ivan Fokeev
ca43c83abc MUL-3523: fix(github): route PR/check_suite webhooks by repo
Fix GitHub pull_request and check_suite webhook routing so events are attributed to the workspace that registered the repository, with fallback to the installation workspace. Includes host-qualified repo matching, account-gated registry routing, deterministic matching, and regression coverage.
2026-06-22 23:44:46 +08:00
Bohan Jiang
da72e2fa22 feat(daemon): inject project description into the agent brief (MUL-3465) (#4395)
* feat(daemon): inject project description into the agent brief

Issues bound to a project only surfaced the project title in the runtime
brief; the project description (durable, project-wide context the owner
sets) was loaded but dropped. Carry it end-to-end:

- claim handler reads proj.Description onto the response (issue-bound and
  quick-create paths)
- new ProjectDescription field on AgentTaskResponse, daemon Task, and
  TaskContextForEnv
- rendered in the brief's `## Project Context` section and written to
  .multica/project/resources.json as project_description

Empty descriptions render nothing (no extra heading). Updated the
projects-and-resources built-in skill docs in the same change.

MUL-3465

Co-authored-by: multica-agent <github@multica.ai>

* feat(projects): clarify project description is injected as agent context

The project description is now durable context injected into every task's
brief, but the UI still presented it as a plain "Description" field, so
existing descriptions could silently become agent input. Add a hint under
the description editor on the project detail page and in the create-project
modal, in all four locales, stating it is shared with agents as context for
every task in the project. No data-semantics change.

Addresses review feedback on PR #4395. MUL-3465

Co-authored-by: multica-agent <github@multica.ai>

* test(handler): assert project description flows through task claim

The execenv tests cover brief rendering, but nothing pinned the claim
handler boundary where proj.Description is read onto the response. Add
two tests — issue-bound and quick-create paths — so a regression in that
assignment fails loudly instead of silently dropping the description.

Addresses review feedback on PR #4395. MUL-3465

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 23:39:27 +08:00
DylanLi
78342a39ce MUL-3305: feat(agent): add qoder CLI as a choice of agent provider. (#2461)
* feat(agent): Qoder ACP runtime, chat reconnect recovery, and task linkage

- Add Qoder CLI backend (ACP transport, model discovery, blocked-args policy)
- Wire daemon/runtime config, docs, and UI provider assets
- Retry terminal task reports; add backoff unit tests
- Chat: SQL attach user message to task; handler + optimistic cache reconcile
- Invalidate chat/task-messages caches on WS reconnect; extract helper + tests

Co-authored-by: Orca <help@stably.ai>
Co-authored-by: Cursor <cursoragent@cursor.com>

* chore: drop non-Qoder changes (chat reconnect, task link, terminal report retries)

Keep only Qoder runtime, docs, daemon config/execenv, and UI provider assets.

Co-authored-by: Orca <help@stably.ai>
Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(agent): harden Qoder ACP drain and wire project skills path

- Stop streaming to msgCh after reader wait so grace timeout cannot race close
- Resolve injected skills to .qoder/skills per Qoder CLI discovery
- Update AGENTS.md skill copy and add execenv tests

Co-authored-by: Orca <help@stably.ai>
Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(qoder): add provider logo and wire MCP config into ACP sessions

- Add inline SVG QoderLogo component to provider-logo.tsx, replacing
  the generic Monitor icon placeholder
- Add convertMcpConfigForACP helper to convert Claude-style MCP server
  config (object map) into ACP array format for session/new and
  session/resume
- Add unit tests for convertMcpConfigForACP covering stdio, SSE,
  empty/nil, and multi-server cases

Co-authored-by: Orca <help@stably.ai>

* fix(test): capture both return values from InjectRuntimeConfig in Qoder test

Co-authored-by: Orca <help@stably.ai>

* fix(qoder): preserve remote MCP headers and promote provider errors

Addresses review feedback on #2461 (Bohan-J): two runtime-correctness
issues in the Qoder ACP backend.

1. Remote MCP headers were dropped. The bespoke convertMcpConfigForACP
   only forwarded url/type, so an authenticated remote MCP server looked
   configured in Multica but failed inside the Qoder session. Replace it
   with the shared buildACPMcpServers helper (same path Hermes/Kimi/Kiro
   use), which preserves headers as [{name, value}], sorts for
   deterministic output, and handles remote transport aliases. Fail
   closed on malformed mcp_config instead of silently dropping servers.

2. Provider failures could report as completed tasks. stderr was wired
   via io.MultiWriter and the result was only promoted to failed when
   output was empty, so a terminal upstream error (HTTP 429 / expired
   token) racing a stopReason=end_turn with text still became
   "completed". Switch to StderrPipe + an explicit copier, drain it
   (bounded by the existing grace window, since qodercli can leave a
   child holding the inherited fds) before the decision, and run the
   shared promoteACPResultOnProviderError.

Tests: replace the convertMcpConfigForACP unit tests with two
end-to-end Qoder tests — one asserts the Authorization header reaches
the session/new payload as {name, value}, the other asserts a terminal
stderr error with non-empty output reports failed.

Co-authored-by: Orca <help@stably.ai>

* fix(qoder): align ACP session handling

Co-authored-by: Orca <help@stably.ai>

* fix(agent): guard qoder late output after drain

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Orca <help@stably.ai>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 18:55:45 +08:00
Naiyuan Qing
149cc9bd0a fix(issues): reflect real common value in batch toolbar pickers (#4403)
The batch action toolbar hardcoded status="todo", priority="none", and a
null assignee, so the status/priority/assignee pickers always checked a
fixed row regardless of the selected issues. The batch write itself worked,
but the picker mis-reported the current value, surfacing as "status always
defaults to todo" (MUL-3510). The same defect applied to priority and
assignee, across all five toolbar mount points.

Derive the shared status/priority/assignee of the selected issues via a new
commonIssueFields helper and feed it to the pickers; when the selection is
mixed, pass an empty value so no row is checked. Pickers now accept a
nullable current value, and AssigneePicker gains a `mixed` flag to
distinguish an all-unassigned selection (check "No assignee") from a mixed
one (check nothing). Each call site passes its issue universe, mirroring the
skill list's selected-rows approach.

Adds unit tests for commonIssueFields and a toolbar picker-wiring test.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 18:23:05 +08:00
Bohan Jiang
637b6ee433 feat: add CLI comment resolve commands (#4404)
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 18:16:38 +08:00
Multica Eve
b5a180b21e docs: update June 22 changelog (#4406)
Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
v0.3.27
2026-06-22 17:30:31 +08:00
YYClaw
329384f052 chore(makefile): expand clean target to remove build caches (#4394)
Extend the clean target beyond server binaries/temp files to also
remove Next.js/source/Expo/Electron build outputs, Turbo caches, and
tsbuildinfo files across apps and packages.
2026-06-22 17:20:42 +08:00
Jiayuan Zhang
916cee5c5d feat(issues): open agent activity chip on hover (#4405)
The header 'agent is working' chip previously required a click to reveal the
activity card. Open it on hover instead so the live signal reads as a
glanceable status surface. The hover config lives on Base UI's Popover.Trigger
(openOnHover + delay/closeDelay), and the trigger stays a real button so
click/keyboard access is retained for touch and a11y.

Add a regression test asserting openOnHover is wired on the trigger so a
click-only implementation can no longer pass.

MUL-3507

Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 17:06:15 +08:00
Naiyuan Qing
4fe8b54e9b MUL-3446: keep chat output in chat (#4387)
* MUL-3446: keep chat output in chat

Co-authored-by: multica-agent <github@multica.ai>

* MUL-3446: simplify chat output guidance

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 15:51:03 +08:00
BeliyDym
5fd3d01d13 MUL-3502: OST-1161: Bound assignment comment catch-up
Squashed PR #4392. Updates assignment/comment catch-up guidance to use recent 10 and aligns related examples.
2026-06-22 15:46:47 +08:00
Jiayuan Zhang
cf30991f91 feat(sidebar): add dismissible Join Discord card (#4400)
Add a Join Discord promo card pinned to the bottom of the left sidebar
(above the help launcher). Dismiss state persists per-user in
localStorage so it stays hidden once closed.

Extract the shared DiscordIcon + invite URL into layout/discord.tsx so
the help launcher and the card reuse one source. i18n copy added for
en / zh-Hans / ja / ko.

MUL-3505

Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 09:35:41 +02:00
YOMXXX
9d053c57f9 MUL-3420: fix(runtimes): clarify custom runtime deletion 2026-06-22 15:01:22 +08:00
Bohan Jiang
8a9f15dbc9 feat: add Discord community entry points (#4388)
Add a Discord invite (https://discord.gg/W8gYBn226t) in three places:
- Website footer: social icon + link in the Resources group (en/zh/ja/ko)
- In-app help menu: Discord item in the help launcher (all 4 locales)
- GitHub repo README: badge + link (README.md and README.zh-CN.md)

MUL-3492

Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 14:00:36 +08:00
Bohan Jiang
5556f4570b fix(issue): skip child-done notification when parent is in backlog (#4391)
When a sub-issue transitions to done, notifyParentOfChildDone posts a
system comment on the parent and wakes the parent's assignee. A parent
deliberately parked in backlog should stay inert: waking it lets the
assignee promote sibling backlog sub-issues into todo, which is the
unwanted auto-activation reported in GitHub Discussion #4320 (MUL-3497).

Add a backlog guard alongside the existing done/cancelled guard so the
whole notification (comment + mention + trigger) is skipped until the
user explicitly moves the parent out of backlog.

MUL-3497

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 13:56:48 +08:00
Bohan Jiang
b13e1808a4 refactor(codex): make permission approval auto-grant observable (#4390)
The daemon auto-grants Codex item/permissions/requestApproval requests by
echoing back the network / fileSystem profile scoped to the current turn.
Previously a malformed params payload and any permission key outside
network / fileSystem were dropped silently, so a future app-server
protocol that adds a new permission shape would be narrowed away with no
trace in daemon logs.

Log both cases (parse failure and dropped keys) without changing the
granted response. Addresses review nits on #4346 / MUL-3451.

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 13:43:42 +08:00