Commit Graph

160 Commits

Author SHA1 Message Date
Bohan Jiang
ec73710dd2 fix(agent/codex): surface stderr tail in initialize / turn startup errors (#1314)
* fix(agent/codex): surface stderr tail in initialize / turn startup errors

When codex app-server exits before the JSON-RPC handshake completes —
e.g. because the user put a flag in custom_args that the subcommand
rejects — the Result.Error users see is `codex initialize failed:
codex process exited`, while codex's actual complaint (typically
something like `error: unexpected argument '-m' found`) only lives in
daemon logs.

Wrap the stderr writer with a bounded stderrTail that still forwards
to the slog logWriter but also retains the last 2 KiB of bytes
written. Include that tail on the three startup failure paths
(initialize, startOrResumeThread, turn/start). Runtime cancellation
paths are left untouched — they're our own abort and the stderr
context isn't a clear signal there.

Refs #1308. Complement to #1310 / #1312 — lets "bad custom_args fail
loudly" actually be workable by giving the failure a real message.

* fix(agent/codex): join cmd.Wait() before sampling stderr tail

Addressing review of #1314: reading stderrBuf.Tail() right after
c.request returns "codex process exited" was racy. Nothing in that
path synchronizes with os/exec's internal stderr copy goroutine —
cmd.Wait() is the only documented join point. The original defer ran
cmd.Wait() later, but by then we had already built Result.Error from
a potentially-empty Tail().

Replace the ad-hoc deferred stdin.Close()/cmd.Wait() with a
sync.Once-wrapped drainAndWait closure. Call it explicitly on the
three startup failure paths before sampling the tail; keep it as the
cleanup defer so the success path behaves identically.

Also add TestCodexExecuteSurfacesStderrWhenChildExitsEarly: spawns a
real subprocess that prints to stderr and exits before responding to
initialize, runs it through Execute, and asserts Result.Error
contains the stderr hint. This covers the full timing path the
reviewer flagged, which the helper-level tests in this PR did not.
2026-04-20 14:38:32 +08:00
Bohan Jiang
bd445782d5 fix(openclaw): stop passing unsupported flags and actually deliver AgentInstructions (#1362)
Fixes #1332.

Two regressions introduced in #910 (2026-04-14, "OpenClaw backend P0+P1
improvements") that together block all openclaw users:

1. `openclaw agent` does not accept `--model` or `--system-prompt`, so
   any agent configured with a Model field crashed in ~700ms with
   `exit status 1`. Remove both forwards, and add them to
   openclawBlockedArgs so custom_args can't reintroduce the crash.
   Model is bound at registration time via `openclaw agents
   add/update --model`.

2. AgentInstructions were written to `{workDir}/AGENTS.md` by
   execenv.InjectRuntimeConfig, but openclaw loads bootstrap files
   from its own workspace dir — the file was never read, so every
   agent's Instructions field was silently discarded. Populate
   opts.SystemPrompt for the openclaw provider in runTask and
   prepend it to the `--message` payload in the backend so the
   model actually receives the instructions.

Other providers surface instructions through their native runtime
config file (CLAUDE.md / AGENTS.md / GEMINI.md) and are intentionally
left unchanged to avoid double injection.

Extract buildOpenclawArgs so arg construction is directly testable;
add unit tests covering the removed flags, the SystemPrompt prepend,
and custom_args filtering.
2026-04-20 14:01:41 +08:00
devv-eve
5fa1da448f fix(chat): preserve chat session resume pointer across failures (#1360)
* fix(chat): preserve chat session resume pointer across failures

The chat 'forgets earlier messages' bug came from PriorSessionID being
silently lost in several edge cases:

- UpdateChatSessionSession unconditionally overwrote chat_session.session_id,
  so any task that completed without a session_id (early agent crash,
  missing result) wiped the resume pointer to NULL.
- CompleteAgentTask + UpdateChatSessionSession ran in separate calls. A
  follow-up chat message claimed in between resumed against a stale (or
  NULL) session and started over.
- FailAgentTask never wrote session_id back, so a task that established
  a real session before failing lost its resume pointer.
- ClaimTaskByRuntime only trusted chat_session.session_id and never
  fell back to the existing GetLastChatTaskSession query, so a single
  bad turn could permanently drop the conversation memory.

This change:

- Use COALESCE in UpdateChatSessionSession so empty inputs preserve the
  existing pointer; surface DB errors instead of swallowing them.
- Run CompleteAgentTask/FailAgentTask + UpdateChatSessionSession inside
  the same transaction (TaskService now takes a TxStarter).
- Extend FailAgentTask + the daemon FailTask path (client, handler,
  service) to forward session_id/work_dir, so failed/blocked tasks that
  built a real session still record it.
- Fall back to GetLastChatTaskSession in ClaimTaskByRuntime when the
  chat_session pointer is missing, and include failed tasks in that
  lookup so a single failure can't lose the conversation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(daemon): forward session_id/work_dir on blocked + timeout paths

runTask previously dropped result.SessionID and env.WorkDir on the
non-completed return paths:

- timeout returned a naked error, so handleTask called FailTask with
  empty session info and the chat resume pointer was either left stale
  or eventually overwritten with NULL.
- blocked / failed (default branch) returned a TaskResult without
  SessionID / WorkDir, so even though FailTask now COALESCEs into
  chat_session, there was no value to write through.
- the empty-output completion path was the same: it raised an error
  even when a real session_id had been built.

All three paths now return a TaskResult that carries the SessionID /
WorkDir the backend produced. Combined with the COALESCE-based update
in UpdateChatSessionSession and the FailTask plumbing introduced in
PR #1360, the next chat turn can always resume from the latest agent
session — even when the previous turn timed out, was rate-limited, or
returned an empty completion — instead of starting over with no memory
of the conversation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(copilot): capture session id from session.start as fallback

The Copilot backend only read sessionId from the synthetic 'result'
event, ignoring the one already present on session.start. When the CLI
was killed before result arrived (timeout, cancel, crash, or a
session.error mid-turn), the daemon reported SessionID="" and the
chat-session resume pointer could not advance — causing the chat to
silently drop conversation memory on the next turn.

Capture session.start.sessionId into state up front, and only let
'result' overwrite it when it actually carries one. result still wins
when present (it is the authoritative end-of-turn record).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(copilot): parse premiumRequests as float to preserve session id

Copilot CLI v1.0.32 serializes premiumRequests as a float (e.g. 7.5),
not an integer. Our copilotResultUsage struct typed it as int, which
made the entire 'result' line fail json.Unmarshal — silently dropping
sessionId on every turn.

This was the real cause of chat memory loss: the daemon reported
SessionID="" to the server, chat_session.session_id stayed NULL, and
the next chat turn never received --resume <id>, so each turn started
a fresh Copilot session with no prior context.

Add a regression test using the real JSON line from CLI v1.0.32 that
asserts sessionId is preserved when premiumRequests is fractional.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Eve <eve@multica.ai>
Co-authored-by: yushen <ldnvnbl@gmail.com>
2026-04-19 22:50:33 -07:00
Bohan Jiang
163f34f918 feat(agents): show launch mode preview in custom args tab (#1312)
* feat(agent): add LaunchHeader per agent type

Each backend in server/pkg/agent/ hardcodes a stable command skeleton
(e.g. `codex app-server --listen stdio://`, `hermes acp`) before
appending opts.CustomArgs. Surfacing that skeleton lets the UI tell
users which command their custom_args are being appended to, so a
Codex user doesn't mistakenly add `-m gpt-5.4-mini` expecting it to
reach the CLI when the subcommand is actually `app-server`.

Expose only the minimum that aids judgment — binary + subcommand, or a
short mode label when there is no subcommand — and deliberately omit
transport values, internal flags, and env to keep the surface small
and renaming-safe.

Refs #1308.

* feat(handler/runtime): surface launch_header on runtime response

runtimeToResponse now derives launch_header from agent.LaunchHeader,
piggybacking on the runtime's existing provider field so the
frontend's RuntimeDevice gains the skeleton without a new endpoint or
DB query. Client gets the header for free whenever it lists agents'
runtimes — which the custom-args tab already does.

Refs #1308.

* feat(ui/agents): show launch mode preview in custom args tab

Thread the resolved RuntimeDevice from AgentDetail into CustomArgsTab
and render its launch_header as a one-line preview above the args
list, so users see `codex app-server <your args>` (or equivalent per
provider) and can tell whether a CLI-style flag like `--model` will
actually reach the invoked subcommand. Source of truth stays in the
Go backend; the TS type just carries the string.

Refs #1308.
2026-04-18 14:18:42 +08:00
niceSprite
746f33a38b fix(claude): clear fresh session_id on resume failure so daemon fallback fires (#1285)
When --resume targets a dead session, claude prints
"No conversation found with session ID: ..." to stderr, emits a stream-json
system init with a fresh session_id, then exits with code 1. The backend
was treating that fresh id as the authoritative session, so
daemon.go's retry-with-fresh-session fallback (SessionID == "" guard)
never triggered. Every subsequent task for the same (issue, agent) pair
stayed permanently broken until the server-side session_id was cleared by
hand.

Fix: when --resume was requested but the emitted session_id differs AND
the run failed, drop the fresh id from Result so the daemon's existing
fallback can do its job. Factored into a pure helper and unit-tested.

Fixes #1284

Co-authored-by: fuxiao <fuxiao@zyql.com>
2026-04-18 12:59:30 +08:00
Korkyzer
63800f05ff fix(agent): add per-agent mcp_config field to restore MCP access (#1168)
* fix(agent): add per-agent mcp_config field to restore MCP access

Closes #1111

The --strict-mcp-config flag was added defensively in #592 to prevent
Claude agents from inheriting MCP state from the outer Claude Code session.
It was meant to be paired with --mcp-config <path> to inject a controlled
set of MCPs, but that path was never implemented, which silently stripped
all user-scope MCPs from spawned agents.

This PR completes the original design by:

- Adding a nullable mcp_config jsonb column to the agents table
- Wiring mcp_config through AgentResponse, Create/Update requests
- Piping it into ExecOptions.McpConfig in the daemon
- Serializing to a temp file and passing --mcp-config <path> in buildClaudeArgs
- Blocklisting --mcp-config in claudeBlockedArgs to prevent override
  via custom_args

Does not touch Codex provider (tracked separately in #674).
Does not implement Multica MCP auto-injection (out of scope).

* fix: disambiguate JSON null vs absent for mcp_config
2026-04-18 01:35:22 +08:00
Bohan Jiang
a73336dcf8 feat(daemon): persistent UUID identity + legacy-id merge at register-time (#1220)
* feat(daemon): persistent UUID identity + legacy-id merge at register-time

daemon_id is now a stable UUID persisted to `<profile-dir>/daemon.id` on
first start, replacing the hostname-derived id that drifted whenever
`.local` appeared/disappeared, a system was renamed, or a profile
switched — each of which used to mint a fresh `agent_runtime` row and
strand agents on the old one.

To migrate existing installs without operator intervention, the daemon
reports every legacy id it may have registered under previously
(`host`, `host` with `.local` stripped, and `host[-profile]` variants
for both). At register-time the server looks up each candidate row
scoped to (workspace, provider), re-points its agents and tasks onto
the new UUID-keyed row, records which legacy id was subsumed in the
new `legacy_daemon_id` column for audit, and deletes the stale row.
Result: users running `xxx.local`-keyed runtimes today transparently
land on the new UUID row on next daemon restart.

The hostname-prefix `MigrateAgentsToRuntime` / `daemon_id LIKE '...-%'`
compatibility shim is no longer needed and has been removed along with
the handler call that invoked it.

* fix(daemon): handle bidirectional .local drift and case drift in legacy merge

Review on #1220 flagged two gaps in the legacy-id migration candidate set:

1. Reverse .local: LegacyDaemonIDs only added the stripped variant when the
   current hostname ended in `.local`. The opposite direction — DB has
   `foo.local`, current host is `foo` — was missed, so runtimes registered
   under the `.local` variant stayed orphaned after upgrade. Now both
   variants (`foo` and `foo.local`) are always emitted, regardless of what
   `os.Hostname()` currently returns, plus their `-<profile>` suffix forms.

2. Case drift: os.Hostname() has been observed returning different casings
   on the same machine across mDNS/reboot state. A case-sensitive `=`
   comparison stranded rows like `Jiayuans-MacBook-Pro.local` when the
   daemon later reported `jiayuans-macbook-pro.local`. FindLegacyRuntimeByDaemonID
   now uses `LOWER(daemon_id) = LOWER(@daemon_id)` on both sides, so casing
   differences merge rather than orphan. The (workspace_id, provider) prefix
   still bounds the scan to a tiny set of rows so the non-indexed LOWER()
   comparison has negligible cost.

Tests: TestLegacyDaemonIDs gets the mixed-case + reverse-direction cases;
daemon_test.go adds TestDaemonRegister_MergesLegacyDaemonIDRuntime_ReverseDotLocal
and TestDaemonRegister_MergesLegacyDaemonIDRuntime_CaseDrift.

* fix(daemon): consolidate every case-duplicate legacy runtime, not just the first

Follow-up review on #1220: after switching to `LOWER(daemon_id) =
LOWER(@daemon_id)`, the single-row lookup still only merged one legacy
row per candidate. If a machine already had two rows in the DB that
differed only in casing (e.g. `Jiayuans-MacBook-Pro.local` AND
`jiayuans-macbook-pro.local` coexisting because earlier hostname drift
already minted a duplicate), only one of them got consolidated and the
other stayed orphaned — violating the "no duplicate runtime per machine
after backfill" acceptance.

- FindLegacyRuntimeByDaemonID → FindLegacyRuntimesByDaemonID (:many)
- mergeLegacyRuntimes iterates every returned row and dedupes across
  overlapping legacy candidates so `foo` and `foo.local` both resolving
  to the same stored row don't double-process

Test: TestDaemonRegister_MergesAllCaseDuplicateLegacyRuntimes seeds two
case-duplicate rows with one agent each and confirms both rows are
deleted and both agents end up on the new UUID-keyed row.
2026-04-17 15:10:38 +08:00
Bohan Jiang
423ceaf8f4 test(agent): regression tests for codex subagent threadId filter (#1257)
Follow-up to #1192. Document the v2 protocol contract that the
dispatch-level threadId guard relies on, and lock down the two leakage
paths the guard closes:

- turn/completed from a subagent thread must not call onTurnDone
- item/completed (agentMessage, final_answer) from a subagent thread
  must neither leak text into the output builder nor terminate the turn

Without these tests a future refactor that drops or relocates the guard
would not be caught by CI, since existing notification tests omit the
top-level threadId field and pass through unfiltered.
2026-04-17 14:49:38 +08:00
niceSprite
462ff88df5 fix(codex): dispatch-level threadId filter for subagent notifications (#1192)
* fix(daemon): filter thread/status/changed by threadId to prevent subagent interference

When Codex CLI has memories enabled, the app-server spawns a memory
consolidation subagent as a separate thread within the same stdio
connection. When that subagent thread finishes and transitions to idle,
the daemon's codex backend mistakenly interprets the idle signal as the
main turn completing, causing it to close stdin and cancel the context
before the real turn produces any output.

Add a threadId check to the thread/status/changed handler so only
status changes from the tracked thread trigger turn completion. Signals
from subagent threads (threadId != c.threadID) are now ignored.

Fixes #1181

* fix(codex): dispatch-level threadId filter for subagent notifications

Codex multiplexes subagent threads (e.g. memory consolidation) on
the same stdio pipe. Previously only thread/status/changed had a
threadId guard, but item/completed (agentMessage + final_answer),
turn/completed, and turn/started from subagent threads could still
trigger onTurnDone or contaminate output.

Move the threadId check to the top of handleRawNotification so all
notification handlers are protected. Remove the now-redundant
per-handler check on thread/status/changed.

Fixes multica-ai/multica#1181

---------

Co-authored-by: fuxiao <fuxiao@zyql.com>
2026-04-17 14:45:09 +08:00
Bohan Jiang
d12d690c38 fix(usage): bucket workspace usage by task_usage.created_at, not enqueue time (#1176)
GetWorkspaceUsageByDay and GetWorkspaceUsageSummary had the same date
attribution bug as the runtime endpoint fixed in #1167: they bucketed
and filtered on agent_task_queue.created_at (enqueue time), so a task
that queued at 23:58 and reported usage at 00:05 was attributed to the
prior day, and ?days=N became a rolling now()-N window that clipped the
morning of the earliest day returned.

Switch both queries to task_usage.created_at (~= task completion time)
and snap the since cutoff to start-of-day via DATE_TRUNC, mirroring
ListRuntimeUsage.

These endpoints have no frontend caller today, but per offline
discussion they will back the upcoming workspace-level usage dashboard.
Fix preemptively so the dashboard inherits correct numbers.

Add a regression test covering both endpoints with the same
cross-midnight + earliest-day-cutoff scenarios used for runtime usage.
2026-04-16 19:06:49 +08:00
Bohan Jiang
a36252ca99 refactor(runtime): derive runtime usage from task_usage only (#1167)
* refactor(runtime): derive runtime usage from task_usage only

The daemon used to scan each runtime's local CLI log directory every 5
minutes (Claude Code, Codex, OpenCode, OpenClaw, Hermes) and post daily
aggregates to /api/daemon/runtimes/{id}/usage. Those directories are
shared with the user's own local CLI sessions, so the user's personal
usage was being counted as Daemon-executed usage. Cursor and Gemini had
no scanner at all, so their runtime-level aggregates were always zero.

Switch GetRuntimeUsage to aggregate task_usage (already scoped to
Daemon-executed tasks) via agent_task_queue.runtime_id. Single source of
truth; Cursor/Gemini/Copilot get runtime usage for free; no reliance on
external CLI log formats.

Removes:
- server/internal/daemon/usage/ (all scanners)
- Daemon.usageScanLoop + providerToRuntimeMap
- Client.ReportUsage
- ReportRuntimeUsage handler + POST /api/daemon/runtimes/{id}/usage
- UpsertRuntimeUsage / GetRuntimeUsageSummary queries
- runtime_usage table (migration 046)

Refs: MUL-786

* fix(runtime): bucket daily usage by task_usage.created_at, not enqueue time

ListRuntimeUsage was aggregating by DATE(atq.created_at) and filtering
on atq.created_at. agent_task_queue.created_at is the enqueue timestamp,
which drifts from actual token-production time: a task queued at 23:58
and executed at 00:05 was attributed to yesterday; a task sitting in
the queue overnight was counted on the queue day.

The ?days=N cutoff also became a rolling window (now() - N) instead of
a calendar-day boundary, silently clipping the morning of the earliest
day returned.

Switch bucket + filter to task_usage.created_at (~= task completion /
usage-report time) and snap the since cutoff to start-of-day via
DATE_TRUNC.

Add a regression test covering both scenarios: cross-midnight task
attributes to the day tokens were reported, and the earliest day's
pre-cutoff rows are still included.
2026-04-16 18:54:12 +08:00
Bohan Jiang
9a97ee1f4c fix(agent): resume codex thread across tasks on the same issue (#1166)
Every other backend (Claude, Gemini, OpenCode, OpenClaw, Hermes) honors
ExecOptions.ResumeSessionID — only Codex didn't. That's why users on
the Codex runtime saw each new comment on an issue start a fresh Codex
conversation: the daemon persists Result.SessionID per (agent, issue)
and passes it back as PriorSessionID, but codex.go always called
thread/start and never populated SessionID, so the value round-tripped
as empty.

Wire the missing half:

- Extract startOrResumeThread on codexClient. When ResumeSessionID is
  set, call thread/resume (per the Codex app-server protocol), passing
  only cwd / model / developerInstructions overrides so the thread
  keeps its persisted model and reasoning effort. If resume fails
  (unknown thread, schema drift, transport error) fall back to
  thread/start so the task still runs on a fresh thread.
- Surface the live threadID as Result.SessionID on the final emit so
  the daemon stores it and feeds it back into ResumeSessionID on the
  next claim.

Tests drive the new helper through the fake stdin harness, covering:
fresh start, successful resume, fallback on resume error, fallback
when resume returns no thread ID, and surfacing of thread/start
failures.
2026-04-16 18:06:11 +08:00
LinYushen
cd50c31201 feat(agent): add GitHub Copilot CLI backend (#1157)
* feat(agent): add GitHub Copilot CLI backend

Integrate Copilot CLI as a new agent backend using the stable
`-p` JSONL mode (`--output-format json`), following the same
spawn-CLI-scan-JSONL pattern established by claude.go.

Backend (server/pkg/agent/copilot.go):
- Spawn `copilot -p <prompt> --output-format json --allow-all-tools --no-ask-user`
- Parse streaming JSONL events (system/assistant/user/result/log)
- Extract session ID for resume support (`--resume <id>`)
- Accumulate per-model token usage for billing
- Filter blocked args to prevent protocol-critical flag overrides

Daemon config:
- Probe MULTICA_COPILOT_PATH / MULTICA_COPILOT_MODEL env vars
- Copilot uses AGENTS.md (native discovery) and default skills path

Frontend:
- Add Copilot logo SVG and provider switch case

Tests: 14 unit tests covering arg building, event parsing, usage
accumulation, and edge cases. All Go + TS checks pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(daemon): add restart subcommand, make daemon uses it

- `daemon start` keeps original behavior: errors if already running
- `daemon restart` stops existing daemon then starts fresh
- `make daemon` now runs `daemon restart --profile local`

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(copilot): address review nits 1-5

- Nit 1: Add MinVersions["copilot"] = "1.0.0"
- Nit 2: Seed activeModel from session.start.data.selectedModel (falls
  back to opts.Model, then "copilot"). First-turn tokens now get correct
  model attribution.
- Nit 3: Handle assistant.reasoning/reasoning_delta → MessageThinking,
  reasoningText in assistant.message → MessageThinking,
  session.warning → MessageLog{warn}
- Nit 4: Extract handleCopilotEvent() method shared by production and
  tests — no more duplicated switch body that can drift
- Nit 5: Deltas write to output buffer as defense-in-depth; if process
  dies before assistant.message, output is non-empty

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-16 17:14:56 +08:00
Bohan Jiang
ac8b08e540 fix(agent): surface codex turn errors instead of reporting empty output (#1156)
When codex emits `turn/completed` with `status="failed"` or a terminal
top-level `error` notification, the daemon previously treated the turn
as successfully completed, saw no accumulated text, and surfaced the
generic "codex returned empty output" — hiding the real reason (auth,
sandbox, API error, etc.).

Capture `turn.error.message` on failed turns and the `error.message`
from non-retrying top-level error notifications, then propagate them
through `Result.Error` with `finalStatus="failed"` so the daemon's
default branch reports the actual cause.
2026-04-16 16:53:08 +08:00
devv-eve
c0b4e7e8b8 feat(agent): add Cursor Agent CLI runtime support (#1057)
* feat(agent): add Cursor Agent CLI runtime support

Add cursor-agent as a new agent backend, following the same pattern as
existing providers. The implementation spawns cursor-agent CLI with
stream-json output, parses JSONL events into the unified Message type,
and supports session resume, usage tracking, and auto-approval (--yolo).

Changes:
- server/pkg/agent/cursor.go: cursorBackend implementation
- server/pkg/agent/cursor_test.go: unit tests for args, parsing, errors
- server/pkg/agent/agent.go: register "cursor" in New() factory
- server/internal/daemon/config.go: probe cursor-agent in PATH
- server/internal/daemon/execenv/context.go: cursor skill discovery path
- server/internal/daemon/execenv/runtime_config.go: AGENTS.md injection
- packages/views/.../provider-logo.tsx: cursor logo in UI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(agent): address PR review for cursor backend

1. Fix token usage double-counting: usage is now taken exclusively from
   "result" events (session totals). Per-message usage in "assistant"
   events is intentionally ignored. "step_finish" usage is only used as
   fallback when no "result" usage is available.

2. Remove dead code: isCursorUnknownSessionError() and its regex were
   defined but never called. Removed along with corresponding test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(agent): add missing CustomArgs, SystemPrompt, MaxTurns, and debug logging to cursor backend

- Add cursorBlockedArgs and filterCustomArgs support for safe custom arg passthrough
- Add --system-prompt and --max-turns flag support to buildCursorArgs
- Add debug logging of command args before execution (consistent with all other backends)
- Move stdout-close goroutine inside main goroutine (consistent with claude.go pattern)
- Add tests for SystemPrompt/MaxTurns and CustomArgs filtering

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: make daemon uses local profile & update Cursor logo to official brand

- Makefile: make daemon now runs 'daemon start --profile local' for local dev
- Replace Cursor runtime logo with official brand SVG (removed background rect)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(agent): remove unsupported --system-prompt and --max-turns from cursor-agent

cursor-agent CLI does not support these flags. Instructions are already
injected via AGENTS.md and .cursor/skills/ files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(agent): prevent step_finish + result usage double-counting in cursor

Split usage accumulation into separate stepUsage and resultUsage maps.
After stream ends, use resultUsage if available (session totals from
result event), otherwise fall back to stepUsage (sum of step_finish).
This prevents 2x counting when result.usage already includes totals.

Added table-driven test covering: result-only, step_finish-only,
step_finish+result (no double count), and multi-model scenarios.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(agent): fix misleading comment on cursor -p flag

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: yushen <ldnvnbl@gmail.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-16 15:54:21 +08:00
Bohan Jiang
8c518c350a feat(agent): add Pi agent runtime support (#1064)
* feat(agent): add Pi agent runtime support

Add Pi as a new agent runtime provider, following the established adapter
pattern. Pi CLI outputs JSONL events which are parsed for messages, tool
calls, and usage tracking.

Backend:
- New piBackend implementing the Backend interface (pi.go)
- Pi CLI discovery via MULTICA_PI_PATH env var or PATH lookup
- JSONL event stream parsing (agent_start, message_update, thinking_update,
  tool_execution_start/end, agent_end)
- Usage scanner for ~/.pi/sessions/*.jsonl files
- Runtime config injection via AGENTS.md
- Skill injection to .pi/agent/skills/

Frontend:
- Pi provider logo (teal π icon)
- Pi label in transcript dialog

Docs:
- Updated all provider lists in README, CLI_INSTALL, and docs

* fix(agent): filter Pi usage scanner to agent_end events only

Address review feedback: restrict usage parsing to agent_end events
which contain cumulative totals, preventing potential inaccuracy if
Pi adds usage fields to other event types in the future.

* fix(agent): align Pi runtime with real CLI flags, event schema, and custom_args

- Flags: Pi's CLI uses `--mode json` (not `--output-format jsonl`), has no
  `--yolo` (explicit `--tools` allowlist instead), takes the prompt as a
  positional argument (not `-p <prompt>`), splits model as
  `--provider <name> --model <id>`, and treats `--session` as a file path
  that must exist before spawn.
- Event parsing: rewrite the stream event struct to match Pi's actual
  JSON event schema (`message_update.assistantMessageEvent.delta`,
  `turn_end.message.usage.{input,output,cacheRead,cacheWrite}`, etc.).
- Sessions: generate/persist session files under ~/.multica/pi-sessions/
  and use the file path as the opaque SessionID returned to the daemon.
- Usage scanner: read assistant `message` events from the same session
  files (Pi's session-file schema, distinct from the stdout stream).
- Custom args: consume `ExecOptions.CustomArgs` via `filterCustomArgs`
  with a Pi-specific blocked set (`-p`, `--print`, `--mode`, `--session`)
  so Pi matches the pattern shared by every other agent backend.
2026-04-16 15:42:40 +08:00
Junghwan
7395b51aee fix(agent): apply filterCustomArgs to hermes backend for parity (#1122)
Every other backend (claude, codex, opencode, openclaw, gemini) filters
opts.CustomArgs through a per-backend blocked map so protocol-critical
flags can't be overridden via the Create Agent UI. The hermes backend
appended CustomArgs directly to argv, so any future flag we add to the
map would be silently bypassed here.

Add hermesBlockedArgs (with 'acp' as the pinned subcommand) and route
CustomArgs through filterCustomArgs. Behaviour is identical for today's
use cases; the change prevents accidental protocol-flag overrides and
brings hermes in line with the other five backends.

Closes #1113

Co-authored-by: shaun0927 <shaun0927@users.noreply.github.com>
2026-04-16 13:53:53 +08:00
Bohan Jiang
b6d30c0e00 feat(agent): log full command line at debug level when spawning agents (#1071)
Add a debug-level log line in every agent backend (claude, codex,
opencode, openclaw, gemini, hermes) that prints the executable path
and full argument list when spawning the agent process. Helps diagnose
custom args, model overrides, and other CLI flag issues.
2026-04-15 16:21:55 +08:00
Bohan Jiang
ce447c7f06 feat(agent): add custom CLI arguments support (#986)
* feat(agent): add custom CLI arguments support

Allow users to configure custom CLI arguments per agent that get
appended to the agent subprocess command at launch time. This enables
use cases like specifying different models (--model o3), max turns,
or other provider-specific flags without needing separate runtimes.

Changes:
- Add custom_args JSONB column to agent table (migration 041)
- Update API handler to accept/return custom_args in create/update
- Pass custom_args through claim endpoint to daemon
- Append custom_args to CLI commands for all agent backends
- Add ExecOptions.CustomArgs field in agent package
- Add Custom Args tab in agent detail UI
- Add --custom-args flag to CLI agent create/update commands

Closes MUL-802

* fix(agent): filter protocol-critical flags from custom_args

Add per-backend filtering of custom_args to prevent users from
accidentally overriding flags that the daemon hardcodes for its
communication protocol (e.g. --output-format, --input-format,
--permission-mode for Claude).

This follows the same pattern as custom_env's isBlockedEnvKey: we
only block the small, stable set of flags that would break the
daemon↔agent protocol — not every possible dangerous flag. Workspace
members are trusted for everything else.

Each backend defines its own blocked set:
- Claude: -p, --output-format, --input-format, --permission-mode
- Gemini: -p, --yolo, -o
- Codex: --listen
- OpenCode: --format
- OpenClaw: --local, --json, --session-id, --message
- Hermes: none (ACP is positional)

Includes unit tests for the filtering logic.

* fix(agent): address code review nits for custom_args

- Replace module-level `nextArgId` counter with `crypto.randomUUID()`
  in custom-args-tab.tsx to avoid SSR ID conflicts
- Add unit tests for custom args passthrough and blocked-arg filtering
  in both Claude and Gemini arg builders
2026-04-15 14:58:53 +08:00
Jiayuan Zhang
f94b0100cd refactor(autopilot): remove broken concurrency policies and fix multiple bugs (#1048)
Remove the concurrency_policy system (skip/queue/replace) — skip had an
orphan bug that permanently blocked triggers, queue didn't actually queue,
and replace didn't cancel running tasks. Every trigger now simply executes.

Bug fixes:
- Listener now handles in_review status (was silently ignored)
- Issue deletion fails linked autopilot runs before DELETE (prevents orphans)
- ComputeNextRun rejects invalid timezones instead of silent UTC fallback
- dispatchCreateIssue post-commit failures now properly fail the run

Reliability:
- Scheduler recovers lost triggers on startup (crash recovery)
- New index on autopilot_run(issue_id) for deletion lookups
- Migration 043 cleans up historical orphaned/skipped/pending runs
2026-04-15 13:48:21 +08:00
Jiayuan Zhang
d88fe2608e feat(autopilot): scheduled/triggered automations for AI agents (#1028)
* feat(autopilot): add scheduled/triggered automation for AI agents

Introduce the Autopilot feature — recurring automations that assign work
to AI agents on a schedule or manual trigger. Supports two execution
modes: create_issue (creates an issue for the agent to work on) and
run_only (directly enqueues an agent task without issue pollution).

Backend: migration (3 tables + 2 columns), sqlc queries, AutopilotService
with concurrency policies (skip/queue/replace), HTTP CRUD + trigger
endpoints, background cron scheduler (30s tick), event listeners for
issue→run and task→run status sync.

Frontend: types, API client methods, TanStack Query hooks with optimistic
mutations, realtime cache invalidation, list page with create dialog,
detail page with trigger management and run history, sidebar nav + routes
for both web and desktop apps.

* feat(autopilot): improve UX — trigger config, edit dialog, template gallery

- Replace raw cron input with friendly frequency tabs (Hourly/Daily/Weekdays/Weekly/Custom), time picker, and timezone dropdown defaulting to user's local timezone
- Fix Select components showing UUIDs instead of names (Base UI render function pattern)
- Add Edit button on detail page opening a unified edit dialog
- Remove project/concurrency/issue-title-template from create/edit (simplify for users)
- Add trigger configuration inline during autopilot creation
- Add template gallery on empty state (6 step-by-step workflow templates)
- Rename "Description" to "Prompt" throughout UI
- Inject autopilot run timestamp into issue description for agent date awareness
- Treat issue status "in_review" as run completion (fixes skip on next trigger)
- Make migration idempotent with IF NOT EXISTS clauses
2026-04-15 04:54:37 +08:00
Bohan Jiang
ff1d348274 feat(security): invitation acceptance flow for workspace members (#1019)
* feat(security): replace instant member-add with invitation acceptance flow

Users invited to a workspace must now explicitly accept the invitation
before becoming a member. This fixes the security vulnerability where
knowing someone's email was enough to auto-register their runtime to
your workspace.

Changes:
- Add workspace_invitation table with pending/accepted/declined/expired states
- Replace CreateMember with CreateInvitation (same endpoint, new behavior)
- Add accept/decline/revoke/list invitation API endpoints
- Add invitation WS events for real-time notification
- Frontend: invitation accept/decline UI in workspace switcher
- Frontend: pending invitations section in members settings tab

* fix(invitation): address PR review nits

- Fix invitation:revoked listener to send event to invitee user (was no-op)
- Remove duplicate queryClient2 in app-sidebar.tsx, reuse existing queryClient
- Add expires_at > now() filter to ListPendingInvitationsByWorkspace query
2026-04-15 00:01:18 +08:00
Naiyuan Qing
a744cd4f45 feat(chat): redesign state, header, and unread tracking
State management
- Pending task / live timeline are now Query-cache single source;
  Zustand mirror removed (fixes duplicate assistant render caused by
  the invalidate→refetch race window)
- WS subscriptions moved from ChatWindow to global useRealtimeSync so
  pending state survives minimize and refresh
- New GET /chat/sessions/:id/pending-task to recover live state on mount
- Drafts persisted per-session (was per-workspace)

Unread tracking
- Migration 040: chat_session.unread_since (event-driven; old chats
  stay clean — no mass backfill)
- POST /chat/sessions/:id/read clears unread; broadcasts
  chat:session_read so other devices sync
- New GET /chat/pending-tasks aggregate for the FAB
- ChatFab: brand-color impulse animation while running, brand-dot
  badge of unread session count
- ChatWindow auto-marks read when user is viewing the session

Header redesign
- Two independent dropdowns: agent (avatar + name + My/Others
  grouping) at the input bottom-left; session (title + agent avatar)
  in the header
- ⊕ new-chat button replaces the old + and history buttons
- Session dropdown lists all sessions across agents with avatars
- Empty state: 3 clickable starter prompts that send immediately
- Mention link renderer falls through to default span on null —
  fixes @member/@agent/@all silently disappearing app-wide
- User messages render through Markdown
- Enter submits in chat input only (with IME guard + codeBlock skip);
  bubble menu hidden in chat

Misc
- Partial index on agent_task_queue for fast pending-task lookup
- 2 new storage keys added to clearWorkspaceStorage
- useMarkChatSessionRead has onError rollback
- chat.* namespace logs across store, mutations, components, realtime

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:21:11 +08:00
Bohan Jiang
bf71802451 fix(server): trigger agent on reply in thread where agent already participated (#981)
When a member replies in a member-started thread without @mentioning the
assigned agent, the on_comment trigger was suppressed — even if the agent
had already replied in that thread. This meant the common flow of
"member posts → agent replies → member follows up" would not re-trigger
the agent on the follow-up.

Add HasAgentRepliedInThread SQL query and check it in isReplyToMemberThread
so that agent participation in a thread is treated as an ongoing conversation.
2026-04-14 18:00:29 +08:00
Bohan Jiang
0a998d1cef Merge pull request #846 from multica-ai/agent/j/feb218fd
feat(agent): support custom environment variables for router/proxy mode
2026-04-14 15:34:21 +08:00
Jiang Bohan
bb4944bae2 fix(openclaw): handle pretty-printed multi-line JSON output
OpenClaw outputs its --json result as pretty-printed multi-line JSON to
stderr. The line-by-line scanner never found a valid JSON object on any
single line, causing the raw JSON to be returned as the chat response.

After exhausting line-by-line parsing, try parsing the accumulated
output as a whole before falling back to raw text.

Closes MUL-725
2026-04-14 14:39:32 +08:00
Jiang Bohan
f3355049bc feat(agent): add live log support for Gemini CLI via stream-json
Switch Gemini backend from `-o text` (batch output) to `-o stream-json`
(NDJSON streaming) so tool calls, text, and errors are forwarded to the
UI in real time instead of collected at the end.

Parses all Gemini stream-json event types: init, message, tool_use,
tool_result, error, and result — including per-model token usage from
the result stats.
2026-04-14 14:17:13 +08:00
Bohan Jiang
c71525e198 Merge pull request #910 from multica-ai/agent/j/openclaw-p0-p1
feat(agent): OpenClaw backend P0+P1 improvements
2026-04-14 14:02:38 +08:00
devv-eve
977dc6479d fix(daemon): prevent task stall when agent process hangs on stdout (#947)
When an agent CLI process hangs (e.g. a tool call blocks on unreachable
I/O), the daemon's scanner blocks indefinitely on stdout, preventing the
Result from ever being sent. This causes tasks to stay in "running"
state permanently with no further events.

Three-layer fix:

1. Agent backends (claude, opencode, openclaw, gemini): add a watchdog
   goroutine that closes the stdout/stderr pipe when the context is
   cancelled, forcing the scanner to unblock. Also set cmd.WaitDelay
   so Go force-closes pipes after 10s if the process doesn't exit.

2. daemon executeAndDrain: add an independent drain timeout (backend
   timeout + 30s buffer) with context-aware select on both the message
   channel and the result channel, so the daemon never blocks forever.

3. daemon ping path: add context-aware select so pings don't deadlock
   if the agent backend stalls.

Closes #925

Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 23:00:27 -07:00
Sanjay Ramadugu
fe0d450471 fix(daemon): correct Gemini backend status on timeout and cancellation
Check runCtx.Err() before readErr/waitErr so that context-driven
process kills (timeout, user cancellation) report the correct status
("timeout" or "aborted") instead of "failed".

When exec.CommandContext kills the gemini process, io.ReadAll can
return a non-nil error as a side-effect of the closed pipe. The
previous code checked readErr first, masking the real cause. This
aligns gemini.go with the ordering already used in claude.go and
hermes.go.

Fixes #914
2026-04-13 16:30:28 -04:00
Jiayuan Zhang
bc1185f525 Merge pull request #755 from sanjay3290/feat/gemini-backend
feat(daemon): add Google Gemini CLI backend
2026-04-14 02:46:20 +08:00
Jiayuan Zhang
ff5f6ac2ee fix(daemon): prevent duplicate runtime registration on profile switch (#906)
* fix(daemon): prevent duplicate runtime registration on profile switch

The daemon_id included a profile name suffix (e.g. "hostname-staging"),
so switching profiles created a new daemon_id that bypassed the UPSERT
dedup constraint, leaving orphaned runtime records in the database.

Three changes:
- Remove profile suffix from daemon_id — use stable hostname only.
  The unique constraint (workspace_id, daemon_id, provider) already
  prevents collisions within the same workspace.
- Auto-migrate agents from old offline runtimes to the newly registered
  runtime during DaemonRegister (same workspace/provider/owner).
- Add TTL-based GC in the runtime sweeper to delete offline runtimes
  with no active agents after 7 days.

Closes MUL-695

* fix(daemon): address code review issues on PR #906

1. Move gcRuntimes() to the main sweep loop — previously it was inside
   sweepStaleRuntimes() after an early return, so it only ran when new
   runtimes were marked stale. Now it runs every sweep cycle independently.

2. Fix DeleteStaleOfflineRuntimes to exclude runtimes with ANY agent
   reference (not just active ones). The FK agent.runtime_id is ON DELETE
   RESTRICT, so archived agents also block deletion.

3. Scope MigrateAgentsToRuntime to the same machine by matching
   daemon_id LIKE '<current_daemon_id>-%'. This prevents cross-machine
   agent migration when the same user has multiple devices.
2026-04-14 01:52:34 +08:00
Jiang Bohan
a0d43ca31a feat(agent): OpenClaw backend P0+P1 improvements
Combined P0 and P1 improvements to the OpenClaw agent backend, informed
by PaperClip's adapter architecture:

P0 — User experience:
- Streaming output — emit MessageText as NDJSON events arrive in real
  time, instead of waiting for the final result blob
- Tool use support — parse and emit MessageToolUse/MessageToolResult
  from streaming events, matching Claude and OpenCode backends
- Model & system prompt — pass --model and --system-prompt to the
  OpenClaw CLI when configured

P1 — Robustness:
- Hardened JSON parsing — tryParseOpenclawResult requires lines to
  start with '{', eliminating fragile brace-scanning that could
  false-match JSON fragments in log lines
- Lifecycle event handling — new "lifecycle" event type with phase
  tracking (error/failed/cancelled), plus structured error objects
  (error.name, error.data.message) matching PaperClip's pattern
- Usage field name variants — parseOpenclawUsage supports multiple
  naming conventions (input/inputTokens/input_tokens, cacheRead/
  cachedInputTokens/cache_read_input_tokens, etc.) with incremental
  accumulation across step_finish events

Backwards compatible with the legacy single JSON blob format.
31 tests covering all new functionality.

Closes MUL-726
2026-04-14 01:52:03 +08:00
Jiang Bohan
bf5395f9ee Revert "fix: handle control_request messages in claude backend (auto-approve was dead code) (#811)"
This reverts commit 4d31b1ecee.
2026-04-13 23:11:36 +08:00
james
e01fa6bd9e fix(agent): prevent Claude runtime pings from hanging after the model has already finished
Claude's stream-json flow can emit the terminal result event while the
child process still waits on open stdin. Closing stdin as soon as the
final result arrives lets the CLI exit cleanly instead of idling until
the daemon timeout fires.

Constraint: Must preserve the existing Claude stream-json protocol and child-process lifecycle
Rejected: Increase ping timeout only | masks the hang without fixing process exit
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep Claude stdin handling aligned with the stream-json terminal result semantics; do not defer closure until goroutine teardown
Tested: Reproduced self-hosted runtime ping timeout locally; verified ping succeeds after closing stdin on result; cd server && go test ./pkg/agent
Not-tested: Full make check; Bedrock/Vertex-specific Claude auth flows
2026-04-13 08:59:34 -05:00
leaderlemon
5cd58183b2 fix(openclaw): handle JSON results with durationMs but no payloads (#862)
Some OpenClaw JSON outputs contain durationMs but lack payloads field.
The original condition rejected these results, causing the agent to
return "openclaw returned no parseable output" instead of the actual
execution result.

Fix by accepting results that have either payloads OR durationMs > 0.

Fixes #830

Co-authored-by: leaderlemon <leaderlemon@users.noreply.github.com>
2026-04-13 19:41:47 +08:00
Bohan Jiang
a94c6481dd fix: compute sub-issue progress from database instead of paginated client cache (#865)
The sub-issue progress indicator (e.g. "0/2") was undercounting because
it was computed from the client-side issue list, which only loads the
first 50 done issues. Sub-issues marked as done beyond that page were
excluded from both the total and done counts.

Added a dedicated backend endpoint (GET /api/issues/child-progress) that
aggregates child issue counts directly from the database, ensuring
accurate totals regardless of client-side pagination or filtering.

Fixes MUL-702
2026-04-13 19:10:28 +08:00
Jiang Bohan
4165401d16 feat(agent): support custom environment variables for router/proxy mode
Add per-agent custom_env configuration that gets injected into the agent
subprocess at launch time. This enables users to configure custom API
endpoints (ANTHROPIC_BASE_URL), API keys (ANTHROPIC_API_KEY), and cloud
provider modes (CLAUDE_CODE_USE_BEDROCK, CLAUDE_CODE_USE_VERTEX) without
requiring code changes.

Changes:
- Migration 040: add custom_env JSONB column to agent table
- Backend: custom_env in agent CRUD API + claim endpoint
- Daemon: merge custom_env into subprocess environment variables
- Frontend: env var editor in agent settings (key-value pairs with
  visibility toggle for sensitive values)

Closes #816
Related: #807, #809
2026-04-13 16:47:56 +08:00
Cocoon-Break
4d31b1ecee fix: handle control_request messages in claude backend (auto-approve was dead code) (#811)
* fix: handle control_request messages in claude backend to enable auto-approve (Closes #810)

Signed-off-by: cocoon <54054995+kuishou68@users.noreply.github.com>

* fix: defer stdin.Close() inside goroutine so control_request writes can succeed

Signed-off-by: Cocoon-Break <54054995+kuishou68@users.noreply.github.com>

---------

Signed-off-by: cocoon <54054995+kuishou68@users.noreply.github.com>
Signed-off-by: Cocoon-Break <54054995+kuishou68@users.noreply.github.com>
2026-04-13 15:47:28 +08:00
Roshan Warrier
e044c7e84b fix(agent): parse openclaw result incrementally (#836)
Co-authored-by: txhno <198242577+txhno@users.noreply.github.com>
2026-04-13 15:29:34 +08:00
Jiayuan Zhang
a3eefcf2c4 Revert "feat: add online status indicator on agent & member avatars (#821)" (#837)
This reverts commit 1d64ea4ba6.
2026-04-13 15:03:31 +08:00
Jiayuan Zhang
1d64ea4ba6 feat: add online status indicator on agent & member avatars (#821)
* feat: add online status indicator dot on agent & member avatars

Backend:
- Track member presence via WebSocket connections in the Hub
- Broadcast member:online/offline events when users connect/disconnect
- Add GET /api/workspaces/{id}/members/online endpoint
- Add member:online and member:offline event type constants

Frontend:
- Add isOnline prop to ActorAvatar with a status dot at top-right corner
- Green dot = online, gray dot = offline, no dot = status unknown
- Fetch online member list via new query, update optimistically on WS events
- Derive agent online status from existing agent.status field
- Wire online status through ActorAvatar views wrapper (enabled by default)

* fix: address code review — fix hub tests and avatar rounding

1. Hub tests: consume the member:online presence event from the first
   connection before asserting on broadcast messages.
2. ActorAvatar: use rounded-[inherit] on the inner wrapper so callers
   can override rounding (e.g. rounded-lg for agent list items).

* fix: consume member:online presence event in integration test

Same fix as the hub unit tests — read and discard the member:online
event before asserting on issue:created in TestWebSocketIntegration.
2026-04-13 14:46:34 +08:00
bulai0408
47eb6cb612 fix(agent): enable network access for Codex sandbox so Multica CLI can reach API
Codex tasks running in workspace-write sandbox mode could not resolve
api.multica.ai because the hardcoded sandbox parameter in thread/start
overrode any config.toml settings, and the default sandbox policy blocks
network access.

Changes:
- Remove hardcoded `sandbox: "workspace-write"` from thread/start RPC —
  let Codex read sandbox config from its own config.toml instead
- Auto-generate config.toml in per-task CODEX_HOME with
  `sandbox_mode = "workspace-write"` and `network_access = true`,
  preserving any existing user settings
- Fix Reuse() to restore CodexHome for Codex provider on workdir reuse

Closes #368
2026-04-13 01:03:43 +08:00
Jiang Bohan
7312b5650c fix(server): fix ListRuntimeUsage to filter by date range instead of row count (#765)
Replace LIMIT $2 with AND date >= $2 in ListRuntimeUsage query. When a
runtime uses multiple models each day has multiple rows, so a row LIMIT
silently returns fewer days than requested.

Also fixes displayName warnings in issue-detail test mocks and adds
missing setOpen to useCallback deps in search-command.

Co-authored-by: jayavibhavnk <jaya11vibhav@gmail.com>
Closes #731
2026-04-12 22:46:07 +08:00
Sanjay Ramadugu
f99f50eb0c feat(daemon): add Google Gemini CLI backend
Registers `gemini` as a sixth supported agent provider alongside claude,
codex, opencode, openclaw, and hermes.

- Daemon config probes for `gemini` on PATH (MULTICA_GEMINI_PATH /
  MULTICA_GEMINI_MODEL env overrides mirror the other providers).
- New agent.geminiBackend in pkg/agent/gemini.go: spawns
  `gemini -p <prompt> --yolo -o text [-m <model>] [-r <session>]`,
  reads stdout to completion, and returns a single MessageText plus
  the standard Result struct (Status / Output / DurationMs).
- Execution environment writes a GEMINI.md file into the task workdir
  (mirroring the existing CLAUDE.md / AGENTS.md injection for other
  providers) so Gemini discovers the Multica runtime meta-skill
  through its native mechanism.

Tests:

- pkg/agent/gemini_test.go — unit coverage for buildGeminiArgs
  (baseline, model override, resume session, omit-when-empty).
- internal/daemon/execenv/TestInjectRuntimeConfigGemini — verifies
  GEMINI.md is written and that CLAUDE.md/AGENTS.md are NOT.

Scope (intentional for v1):

- Text output only (`-o text`). Streaming tool events via
  `--output-format stream-json` is a follow-up once we have a
  reliable reproduction of Gemini's event schema.
- No MCP config plumbing. Gemini's `--allowed-mcp-server-names`
  filter pairs well with the per-agent MCP work on feat/per-agent-mcp;
  stacking the two can land as a follow-up.
- No token usage scraping (Gemini's accounting lives on the Google
  Cloud side, not a local JSONL log like claude/codex).
- No session resume wiring beyond accepting the ExecOptions field —
  the daemon does not yet persist Gemini session IDs because the text
  output mode does not expose them.

Migration / env changes:

- New optional environment variables MULTICA_GEMINI_PATH and
  MULTICA_GEMINI_MODEL. Default path is the string "gemini" (resolved
  via PATH at daemon startup). If no Gemini install is detected, the
  provider is simply absent from the runtime — no behavior change for
  existing deployments.
2026-04-11 22:58:49 -04:00
Bohan Jiang
d11824807a fix(agent): handle braces in stderr log lines before openclaw JSON result (#718)
processOutput() used strings.Index(raw, "{") to find the JSON start,
but error lines like `raw_params={"command":"..."}` contain braces that
get matched first, causing JSON parsing to fail and the entire raw
stderr (including internal metadata) to be returned as the agent comment.

Now tries each '{' position until one successfully unmarshals as a valid
openclawResult, skipping braces embedded in log/error lines.
2026-04-11 23:07:58 +08:00
Bohan Jiang
e477d64548 fix(cli): poll health endpoint instead of fixed sleep in daemon start (#716)
* fix(cli): poll health endpoint instead of fixed sleep in daemon start

The daemon start command waited a fixed 2 seconds then checked the
health endpoint once. If the daemon took longer to initialize (auth,
workspace loading), the check failed and printed a misleading error
even though the daemon started successfully.

Replace the single check with a polling loop (500ms interval, 15s
timeout) so the CLI waits for the daemon to actually be ready.

* fix(agent): rewrite openclaw tests to match new backend API

The openclaw backend was rewritten in #715 to parse a single JSON blob
instead of streaming NDJSON events. The tests still referenced the old
types (openclawEvent) and methods (handleOCTextEvent, etc.), causing a
build failure in CI.

Rewrite all tests to exercise the new processOutput method and
openclawInt64 helper.
2026-04-11 22:25:19 +08:00
Bohan Jiang
2e33084097 fix(agent): rewrite openclaw backend to match actual CLI interface (#715)
* fix(agent): use --message flag for OpenClaw CLI invocation

OpenClaw CLI changed its prompt flag from `-p` to `--message`. The old
flag caused tasks to fail immediately with "required option '-m,
--message <text>' not specified".

Fixes #713, relates to #703.

* fix(agent): rewrite openclaw backend to match actual CLI interface

- Replace unsupported flags (-p, --output-format, --yes) with correct
  ones (--message, --json, --local, --session-id)
- Read JSON result from stderr (where openclaw writes it)
- Parse openclaw's actual output format ({payloads, meta})
- Auto-generate session ID for each task execution
- Show "live log not available" hint in agent live card when timeline
  is empty (openclaw doesn't support streaming)
2026-04-11 22:14:47 +08:00
Jiayuan Zhang
b3f98ef95d fix(server): skip auto-comment when agent already posted during task (#712)
* fix(server): skip auto-comment when agent already posted during task

In CompleteTask(), check if the agent already posted a comment on the
issue since the task started. If so, skip the automatic output comment
to avoid duplicates. This preserves the fallback for agents that don't
post comments via CLI.

Closes MUL-609

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(server): use StartedAt instead of CreatedAt for duplicate check

CreatedAt is the enqueue time, not execution start. If a previous task
posted a comment between enqueue and start of the next task, it would
incorrectly suppress the auto-comment for the later task.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:27:02 +08:00
Zheng Li
a76194744a feat(cli): add --project filter to issue list (#691)
Co-authored-by: nocoo <nocoo@users.noreply.github.com>
2026-04-11 14:37:24 +08:00