* fix(agent/codex): surface stderr tail in initialize / turn startup errors
When codex app-server exits before the JSON-RPC handshake completes —
e.g. because the user put a flag in custom_args that the subcommand
rejects — the Result.Error users see is `codex initialize failed:
codex process exited`, while codex's actual complaint (typically
something like `error: unexpected argument '-m' found`) only lives in
daemon logs.
Wrap the stderr writer with a bounded stderrTail that still forwards
to the slog logWriter but also retains the last 2 KiB of bytes
written. Include that tail on the three startup failure paths
(initialize, startOrResumeThread, turn/start). Runtime cancellation
paths are left untouched — they're our own abort and the stderr
context isn't a clear signal there.
Refs #1308. Complement to #1310 / #1312 — lets "bad custom_args fail
loudly" actually be workable by giving the failure a real message.
* fix(agent/codex): join cmd.Wait() before sampling stderr tail
Addressing review of #1314: reading stderrBuf.Tail() right after
c.request returns "codex process exited" was racy. Nothing in that
path synchronizes with os/exec's internal stderr copy goroutine —
cmd.Wait() is the only documented join point. The original defer ran
cmd.Wait() later, but by then we had already built Result.Error from
a potentially-empty Tail().
Replace the ad-hoc deferred stdin.Close()/cmd.Wait() with a
sync.Once-wrapped drainAndWait closure. Call it explicitly on the
three startup failure paths before sampling the tail; keep it as the
cleanup defer so the success path behaves identically.
Also add TestCodexExecuteSurfacesStderrWhenChildExitsEarly: spawns a
real subprocess that prints to stderr and exits before responding to
initialize, runs it through Execute, and asserts Result.Error
contains the stderr hint. This covers the full timing path the
reviewer flagged, which the helper-level tests in this PR did not.
Follow-up to PR #1188 / migration 047, which intentionally omitted the
five historical conflict slugs (admin / multica / new / setup / www) from
the reserved-slug audit because each had one production workspace using
it at the time and we did not want to block deploy on owner outreach.
MUL-972 closed that loop on prd for four of the five:
* admin (99cd10e4-…) → renamed to legacy-admin-99cd10e4
* multica (dcd796aa-…) → renamed to legacy-multica-dcd796aa
* new (e391e3ed-…) → renamed to legacy-new-e391e3ed
* www (5e8d38b2-…) → workspace deleted (was empty: 0 issues /
projects / agents, owner-only member; 18
workspace-FK relations all CASCADE)
This PR:
1. Adds migration 049_audit_legacy_reserved_slugs which audits those
four slugs against workspace.slug at startup. If any future workspace
slips in with one of them, startup fails loudly via RAISE EXCEPTION
instead of being silently shadowed by a global route. Mirrors the
structure of 047.
2. Adds 'multica' / 'www' / 'new' to the reserved-slug allow-deny list
in both the Go handler and the shared TS list (admin was already in
both). Keeps the two lists in lockstep per the convention enforced
in workspace_reserved_slugs.go header.
setup is STILL exempt from the audit and is intentionally NOT added to
the reserved list. The setup workspace (b43f0bc2-…) is a real production
user (owner: Roberto Betancourth, building a chants/Alabanzas app) and
is being handled out-of-band via owner outreach. A separate follow-up
migration will fold setup into the audit once that workspace's slug has
been migrated.
Migration is intentionally shipped AFTER the prd data fix (not before):
049 will RAISE EXCEPTION on any remaining conflict, so we want the data
state clean first. Rollout order:
prd data fix (done by db-boy on 2026-04-20) → this PR.
Tested:
- go test ./server/internal/handler/ -run TestReserved → pass
- pnpm --filter @multica/core test consistency → pass (4/4 in
consistency.test.ts; global-prefix↔reserved invariant holds)
Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes#1332.
Two regressions introduced in #910 (2026-04-14, "OpenClaw backend P0+P1
improvements") that together block all openclaw users:
1. `openclaw agent` does not accept `--model` or `--system-prompt`, so
any agent configured with a Model field crashed in ~700ms with
`exit status 1`. Remove both forwards, and add them to
openclawBlockedArgs so custom_args can't reintroduce the crash.
Model is bound at registration time via `openclaw agents
add/update --model`.
2. AgentInstructions were written to `{workDir}/AGENTS.md` by
execenv.InjectRuntimeConfig, but openclaw loads bootstrap files
from its own workspace dir — the file was never read, so every
agent's Instructions field was silently discarded. Populate
opts.SystemPrompt for the openclaw provider in runTask and
prepend it to the `--message` payload in the backend so the
model actually receives the instructions.
Other providers surface instructions through their native runtime
config file (CLAUDE.md / AGENTS.md / GEMINI.md) and are intentionally
left unchanged to avoid double injection.
Extract buildOpenclawArgs so arg construction is directly testable;
add unit tests covering the removed flags, the SystemPrompt prepend,
and custom_args filtering.
* fix(chat): preserve chat session resume pointer across failures
The chat 'forgets earlier messages' bug came from PriorSessionID being
silently lost in several edge cases:
- UpdateChatSessionSession unconditionally overwrote chat_session.session_id,
so any task that completed without a session_id (early agent crash,
missing result) wiped the resume pointer to NULL.
- CompleteAgentTask + UpdateChatSessionSession ran in separate calls. A
follow-up chat message claimed in between resumed against a stale (or
NULL) session and started over.
- FailAgentTask never wrote session_id back, so a task that established
a real session before failing lost its resume pointer.
- ClaimTaskByRuntime only trusted chat_session.session_id and never
fell back to the existing GetLastChatTaskSession query, so a single
bad turn could permanently drop the conversation memory.
This change:
- Use COALESCE in UpdateChatSessionSession so empty inputs preserve the
existing pointer; surface DB errors instead of swallowing them.
- Run CompleteAgentTask/FailAgentTask + UpdateChatSessionSession inside
the same transaction (TaskService now takes a TxStarter).
- Extend FailAgentTask + the daemon FailTask path (client, handler,
service) to forward session_id/work_dir, so failed/blocked tasks that
built a real session still record it.
- Fall back to GetLastChatTaskSession in ClaimTaskByRuntime when the
chat_session pointer is missing, and include failed tasks in that
lookup so a single failure can't lose the conversation.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(daemon): forward session_id/work_dir on blocked + timeout paths
runTask previously dropped result.SessionID and env.WorkDir on the
non-completed return paths:
- timeout returned a naked error, so handleTask called FailTask with
empty session info and the chat resume pointer was either left stale
or eventually overwritten with NULL.
- blocked / failed (default branch) returned a TaskResult without
SessionID / WorkDir, so even though FailTask now COALESCEs into
chat_session, there was no value to write through.
- the empty-output completion path was the same: it raised an error
even when a real session_id had been built.
All three paths now return a TaskResult that carries the SessionID /
WorkDir the backend produced. Combined with the COALESCE-based update
in UpdateChatSessionSession and the FailTask plumbing introduced in
PR #1360, the next chat turn can always resume from the latest agent
session — even when the previous turn timed out, was rate-limited, or
returned an empty completion — instead of starting over with no memory
of the conversation.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(copilot): capture session id from session.start as fallback
The Copilot backend only read sessionId from the synthetic 'result'
event, ignoring the one already present on session.start. When the CLI
was killed before result arrived (timeout, cancel, crash, or a
session.error mid-turn), the daemon reported SessionID="" and the
chat-session resume pointer could not advance — causing the chat to
silently drop conversation memory on the next turn.
Capture session.start.sessionId into state up front, and only let
'result' overwrite it when it actually carries one. result still wins
when present (it is the authoritative end-of-turn record).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(copilot): parse premiumRequests as float to preserve session id
Copilot CLI v1.0.32 serializes premiumRequests as a float (e.g. 7.5),
not an integer. Our copilotResultUsage struct typed it as int, which
made the entire 'result' line fail json.Unmarshal — silently dropping
sessionId on every turn.
This was the real cause of chat memory loss: the daemon reported
SessionID="" to the server, chat_session.session_id stayed NULL, and
the next chat turn never received --resume <id>, so each turn started
a fresh Copilot session with no prior context.
Add a regression test using the real JSON line from CLI v1.0.32 that
asserts sessionId is preserved when premiumRequests is fractional.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Eve <eve@multica.ai>
Co-authored-by: yushen <ldnvnbl@gmail.com>
Closes#930
- Added environment variables to control signups
- Updated frontend to hide signup text when disabled
- Added backend check to block new user creation via magic link
- Updated .env.example
* feat(agent): add LaunchHeader per agent type
Each backend in server/pkg/agent/ hardcodes a stable command skeleton
(e.g. `codex app-server --listen stdio://`, `hermes acp`) before
appending opts.CustomArgs. Surfacing that skeleton lets the UI tell
users which command their custom_args are being appended to, so a
Codex user doesn't mistakenly add `-m gpt-5.4-mini` expecting it to
reach the CLI when the subcommand is actually `app-server`.
Expose only the minimum that aids judgment — binary + subcommand, or a
short mode label when there is no subcommand — and deliberately omit
transport values, internal flags, and env to keep the surface small
and renaming-safe.
Refs #1308.
* feat(handler/runtime): surface launch_header on runtime response
runtimeToResponse now derives launch_header from agent.LaunchHeader,
piggybacking on the runtime's existing provider field so the
frontend's RuntimeDevice gains the skeleton without a new endpoint or
DB query. Client gets the header for free whenever it lists agents'
runtimes — which the custom-args tab already does.
Refs #1308.
* feat(ui/agents): show launch mode preview in custom args tab
Thread the resolved RuntimeDevice from AgentDetail into CustomArgsTab
and render its launch_header as a one-line preview above the args
list, so users see `codex app-server <your args>` (or equivalent per
provider) and can tell whether a CLI-style flag like `--model` will
actually reach the invoked subcommand. Source of truth stays in the
Go backend; the TS type just carries the string.
Refs #1308.
* fix(comment): assignee on_comment path should use reply id, not thread root
Symmetric fix to #871 — that PR fixed the @mention path but missed the
assignee on_comment path in the same file. Replies on agent-assigned
issues were still getting trigger_comment_id = parent_id, so the daemon
fed the parent comment's content to the resumed claude session, which
then either exited with 'Already replied to comment <parent>' or silently
misrouted its answer depending on model / session state.
Reply placement (flat-thread grouping) is already decoupled from
trigger_comment_id by TaskService.createAgentComment's parent
normalization (added alongside #871), so passing comment.ID directly is
safe and matches the mention path's post-#871 behavior.
Fixes#1301
Made-with: Cursor
* test(comment): assert assignee on_comment records reply id as trigger_comment_id
Integration regression guard for #1301. Asserts that after a member posts
a reply under an agent-authored thread, the enqueued agent task's
trigger_comment_id matches the new reply, not the thread root. Without
the companion fix in comment.go the old parent-override would store the
root id and the daemon would feed stale content (via prompt.go
BuildPrompt) to the agent.
Made-with: Cursor
---------
Co-authored-by: fuxiao <fuxiao@zyql.com>
Agent mentions enqueue a new task; member mentions send a notification.
Without this warning, agents have used `[@Name](mention://agent/<id>)` in
prose (e.g. "GPT-Boy is correct") and accidentally re-triggered the agent.
Adds a caveat under `## Mentions` in the prompt injected into agent
runtimes, plus tightens the Agent bullet to make the side-effect explicit.
When --resume targets a dead session, claude prints
"No conversation found with session ID: ..." to stderr, emits a stream-json
system init with a fresh session_id, then exits with code 1. The backend
was treating that fresh id as the authoritative session, so
daemon.go's retry-with-fresh-session fallback (SessionID == "" guard)
never triggered. Every subsequent task for the same (issue, agent) pair
stayed permanently broken until the server-side session_id was cleared by
hand.
Fix: when --resume was requested but the emitted session_id differs AND
the run failed, drop the fresh id from Result so the daemon's existing
fallback can do its job. Factored into a pure helper and unit-tested.
Fixes#1284
Co-authored-by: fuxiao <fuxiao@zyql.com>
* fix(agent): add per-agent mcp_config field to restore MCP access
Closes#1111
The --strict-mcp-config flag was added defensively in #592 to prevent
Claude agents from inheriting MCP state from the outer Claude Code session.
It was meant to be paired with --mcp-config <path> to inject a controlled
set of MCPs, but that path was never implemented, which silently stripped
all user-scope MCPs from spawned agents.
This PR completes the original design by:
- Adding a nullable mcp_config jsonb column to the agents table
- Wiring mcp_config through AgentResponse, Create/Update requests
- Piping it into ExecOptions.McpConfig in the daemon
- Serializing to a temp file and passing --mcp-config <path> in buildClaudeArgs
- Blocklisting --mcp-config in claudeBlockedArgs to prevent override
via custom_args
Does not touch Codex provider (tracked separately in #674).
Does not implement Multica MCP auto-injection (out of scope).
* fix: disambiguate JSON null vs absent for mcp_config
GitHub Copilot CLI scans project-level skills from .github/skills/<name>/SKILL.md
(per the official cli-config-dir-reference docs), not from .agent_context/skills/.
Previously, skills injected for the copilot provider were placed under
.agent_context/skills/ and only referenced by name in AGENTS.md, meaning
Copilot would not actually pick them up.
- resolveSkillsDir: add a dedicated copilot case writing to .github/skills/
- Update doc comments in context.go and runtime_config.go
- Add TestWriteContextFilesCopilotNativeSkills covering the new path and
ensuring .agent_context/skills/ is not created for copilot
Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* feat(cli): add `issue subscriber` commands
Wrap the existing /subscribers, /subscribe, and /unsubscribe endpoints as
`multica issue subscriber list|add|remove`, mirroring the comment subcommand
shape. `--user <name>` reuses resolveAssignee to resolve a member or agent;
without the flag, the action targets the caller.
* fix(issues): default subscribe target to resolveActor, not X-User-ID
When no user_id is posted, subscribe/unsubscribe hardcoded the target as
("member", X-User-ID). A CLI caller running as an agent (X-Agent-ID set)
then subscribed the underlying member rather than the agent itself,
which contradicts the "defaults to the caller" contract.
Derive the default via resolveActor so the endpoint mirrors caller
identity consistently — agent caller → agent row, member caller →
member row. Adds a regression test covering the agent caller path.
Before this PR, `EnsureDaemonID(profile)` wrote to ~/.multica/profiles/
<profile>/daemon.id — meaning the same physical machine minted a different
UUID per profile. On any host running both the CLI-spawned daemon (default
profile) and the desktop-spawned daemon (profile derived from API host),
that produced two runtime rows per provider per workspace. The server-side
`legacy_daemon_ids` merge only covers hostname variants, not UUIDs, so the
rows just piled up.
Profile boundaries are about which backend/account the daemon is talking
to, not about the physical machine. Identity should be per-machine, token
should be per-profile.
Changes:
- `EnsureDaemonID` now always reads/writes ~/.multica/daemon.id regardless
of the `profile` argument. The argument is retained for migration-only
use (see promotion below).
- Migration path: when the canonical file is missing and the requested
profile has a pre-change per-profile daemon.id, promote that UUID in
place so a user who only ever ran under a named profile keeps the same
identity instead of minting a fresh UUID and round-tripping a merge.
- New `LegacyDaemonUUIDs()` scans ~/.multica/profiles/*/daemon.id and
returns every UUID that survives parsing. `config.go` now appends those
to the daemon's `legacy_daemon_ids` payload, so any runtime rows
previously registered under a per-profile UUID (on any backend) get
merged into the canonical machine UUID at register time.
Tests replace the `ProfileIsolated` assertion with `SharedAcrossProfiles`
and add coverage for promotion, UUID scanning (including skipping corrupt
files), and the empty-profiles-dir fast path.
* feat(daemon): persistent UUID identity + legacy-id merge at register-time
daemon_id is now a stable UUID persisted to `<profile-dir>/daemon.id` on
first start, replacing the hostname-derived id that drifted whenever
`.local` appeared/disappeared, a system was renamed, or a profile
switched — each of which used to mint a fresh `agent_runtime` row and
strand agents on the old one.
To migrate existing installs without operator intervention, the daemon
reports every legacy id it may have registered under previously
(`host`, `host` with `.local` stripped, and `host[-profile]` variants
for both). At register-time the server looks up each candidate row
scoped to (workspace, provider), re-points its agents and tasks onto
the new UUID-keyed row, records which legacy id was subsumed in the
new `legacy_daemon_id` column for audit, and deletes the stale row.
Result: users running `xxx.local`-keyed runtimes today transparently
land on the new UUID row on next daemon restart.
The hostname-prefix `MigrateAgentsToRuntime` / `daemon_id LIKE '...-%'`
compatibility shim is no longer needed and has been removed along with
the handler call that invoked it.
* fix(daemon): handle bidirectional .local drift and case drift in legacy merge
Review on #1220 flagged two gaps in the legacy-id migration candidate set:
1. Reverse .local: LegacyDaemonIDs only added the stripped variant when the
current hostname ended in `.local`. The opposite direction — DB has
`foo.local`, current host is `foo` — was missed, so runtimes registered
under the `.local` variant stayed orphaned after upgrade. Now both
variants (`foo` and `foo.local`) are always emitted, regardless of what
`os.Hostname()` currently returns, plus their `-<profile>` suffix forms.
2. Case drift: os.Hostname() has been observed returning different casings
on the same machine across mDNS/reboot state. A case-sensitive `=`
comparison stranded rows like `Jiayuans-MacBook-Pro.local` when the
daemon later reported `jiayuans-macbook-pro.local`. FindLegacyRuntimeByDaemonID
now uses `LOWER(daemon_id) = LOWER(@daemon_id)` on both sides, so casing
differences merge rather than orphan. The (workspace_id, provider) prefix
still bounds the scan to a tiny set of rows so the non-indexed LOWER()
comparison has negligible cost.
Tests: TestLegacyDaemonIDs gets the mixed-case + reverse-direction cases;
daemon_test.go adds TestDaemonRegister_MergesLegacyDaemonIDRuntime_ReverseDotLocal
and TestDaemonRegister_MergesLegacyDaemonIDRuntime_CaseDrift.
* fix(daemon): consolidate every case-duplicate legacy runtime, not just the first
Follow-up review on #1220: after switching to `LOWER(daemon_id) =
LOWER(@daemon_id)`, the single-row lookup still only merged one legacy
row per candidate. If a machine already had two rows in the DB that
differed only in casing (e.g. `Jiayuans-MacBook-Pro.local` AND
`jiayuans-macbook-pro.local` coexisting because earlier hostname drift
already minted a duplicate), only one of them got consolidated and the
other stayed orphaned — violating the "no duplicate runtime per machine
after backfill" acceptance.
- FindLegacyRuntimeByDaemonID → FindLegacyRuntimesByDaemonID (:many)
- mergeLegacyRuntimes iterates every returned row and dedupes across
overlapping legacy candidates so `foo` and `foo.local` both resolving
to the same stored row don't double-process
Test: TestDaemonRegister_MergesAllCaseDuplicateLegacyRuntimes seeds two
case-duplicate rows with one agent each and confirms both rows are
deleted and both agents end up on the new UUID-keyed row.
These trigger kinds exist in the DB schema but nothing on the server
fires them:
- autopilot_scheduler.ClaimDueScheduleTriggers filters kind='schedule'
(pkg/db/queries/autopilot.sql:150)
- DispatchAutopilot is reached only from the scheduler (source:schedule)
or POST /api/autopilots/{id}/trigger (source:manual); no inbound
webhook or api endpoint exists
- The UI only surfaces schedule creation
Exposing them in the CLI lets users create triggers that sit in the DB
doing nothing. Drop --kind from trigger-add, require --cron, always
send kind=schedule. Re-add the flag when the server grows a dispatch
path for the other kinds.
Follow-up to #1249. Two small follow-ups requested in review:
1. `resolveTaskWorkspaceID` was duplicated between `handler/daemon.go` and
`service/task.go`. #1249 fixed the handler copy but left both in place,
meaning any future branch (e.g. a fourth task link type) still needs
to be added in two files. Promote the service method to the exported
`TaskService.ResolveTaskWorkspaceID` and delete the handler copy.
Handler's `requireDaemonTaskAccess` and `ListTaskMessagesByUser` now
call through `h.TaskService`.
2. Add a regression test `TestStartTask_AutopilotRunOnlyTask_ResolvesWorkspace`
covering the exact scenario from #1224: a task linked only via
`AutopilotRunID` must resolve to the autopilot's workspace. The test
asserts 404 for a cross-workspace daemon token and 200 (with status
transitioning to `running`) for the correct-workspace token.
Follow-up to #1192. Document the v2 protocol contract that the
dispatch-level threadId guard relies on, and lock down the two leakage
paths the guard closes:
- turn/completed from a subagent thread must not call onTurnDone
- item/completed (agentMessage, final_answer) from a subagent thread
must neither leak text into the output builder nor terminate the turn
Without these tests a future refactor that drops or relocates the guard
would not be caught by CI, since existing notification tests omit the
top-level threadId field and pass through unfiltered.
* feat(cli): add autopilot commands
Expose the existing autopilot REST API through the multica CLI so
users and agents can list, get, create, update, delete, trigger, and
inspect autopilots, plus manage their triggers (schedule/webhook/api).
Also surface the read + core write commands in the agent meta skill
prompt so agents discover them without needing --help.
- new cmd_autopilot.go (+ test) wiring /api/autopilots endpoints
- add APIClient.PatchJSON (autopilot update uses PATCH)
- expose autopilot in CORE COMMANDS group
- extend runtime_config.go meta skill with autopilot entries
- document autopilot command group in CLI_AND_DAEMON.md
* fix(autopilot): address code review — restrict run_only, validate workspace on update
Code review caught two issues with the initial CLI PR:
1. run_only mode is broken end-to-end. The daemon-side
resolveTaskWorkspaceID() in internal/handler/daemon.go only resolves
workspace from issue/chat, so run_only tasks (which have neither)
return 404 from /start. BuildPrompt() would also emit an empty issue
ID. The service-level resolver in internal/service/task.go already
handles AutopilotRunID, but the daemon endpoint uses the handler
copy. Fixing that path is out of scope for the CLI PR; drop
run_only from the CLI and docs so we don't recommend a mode that
cannot complete. Server continues to accept it for the existing UI.
2. UpdateAutopilot did not verify that a new assignee_id belongs to
the workspace, unlike CreateAutopilot. This let a PATCH swap in an
agent from a different workspace. Mirror the same
GetAgentInWorkspace check.
* fix(daemon): filter thread/status/changed by threadId to prevent subagent interference
When Codex CLI has memories enabled, the app-server spawns a memory
consolidation subagent as a separate thread within the same stdio
connection. When that subagent thread finishes and transitions to idle,
the daemon's codex backend mistakenly interprets the idle signal as the
main turn completing, causing it to close stdin and cancel the context
before the real turn produces any output.
Add a threadId check to the thread/status/changed handler so only
status changes from the tracked thread trigger turn completion. Signals
from subagent threads (threadId != c.threadID) are now ignored.
Fixes#1181
* fix(codex): dispatch-level threadId filter for subagent notifications
Codex multiplexes subagent threads (e.g. memory consolidation) on
the same stdio pipe. Previously only thread/status/changed had a
threadId guard, but item/completed (agentMessage + final_answer),
turn/completed, and turn/started from subagent threads could still
trigger onTurnDone or contaminate output.
Move the threadId check to the top of handleRawNotification so all
notification handlers are protected. Remove the now-redundant
per-handler check on thread/status/changed.
Fixesmultica-ai/multica#1181
---------
Co-authored-by: fuxiao <fuxiao@zyql.com>
resolveTaskWorkspaceID only handled tasks linked via IssueID or
ChatSessionID. Tasks created by run_only autopilots (introduced in
#1028) have only AutopilotRunID set, so the resolver returned an empty
workspace ID, causing requireDaemonTaskAccess to respond with 404.
Add an AutopilotRunID branch that looks up the autopilot run, then
its parent autopilot, to obtain the workspace ID.
* fix(daemon): platform-aware Codex sandbox config to unbreak macOS network
On macOS, Codex's Seatbelt sandbox in workspace-write mode silently
ignores '[sandbox_workspace_write] network_access = true' (see
openai/codex#10390). That blocks DNS inside the sandbox, so 'multica
issue get' and other CLI calls fail with 'dial tcp: lookup ...: no such
host' — this is what caused MUL-963.
Changes:
- New server/internal/daemon/execenv/codex_sandbox.go: picks a sandbox
policy based on runtime.GOOS and the detected Codex CLI version.
Non-darwin or darwin with a known-fixed version keeps workspace-write
+ network_access=true; older darwin falls back to danger-full-access
and logs a warn with upgrade hint. The fix-version threshold is a
single constant (CodexDarwinNetworkAccessFixedVersion) so it's easy
to bump once upstream ships.
- Per-task config.toml now gets a 'multica-managed' marker block
(BEGIN/END comments) rewritten idempotently; user-owned keys outside
the markers are preserved. Legacy inline sandbox directives from
earlier daemon versions are stripped on migration.
- execenv.PrepareParams gains CodexVersion; execenv.Reuse takes a
codexVersion arg; daemon.go caches detected versions at registration
and threads them through to Prepare/Reuse.
- Replaces the old ensureCodexNetworkAccess tests with
platform-parameterised coverage (linux vs darwin, idempotency,
legacy-migration, policy matrix).
- docs/codex-sandbox-troubleshooting.md: symptom fingerprint table,
decision matrix, self-check commands, trade-offs.
Refs: MUL-963
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(daemon): hoist managed sandbox block above user tables (MUL-963)
Review on #1246 flagged that upsertMulticaManagedBlock appended the
managed block to EOF. If the user's config.toml ends inside a TOML table
(e.g. [permissions.multica] or [profiles.foo]), a trailing bare
sandbox_mode = "..." is parsed as a key of that preceding table, so
Codex silently ignores the policy the daemon meant to apply.
Two changes make the block position-independent:
- renderMulticaManagedBlock now emits only top-level key=value lines and
uses TOML dotted-key form (sandbox_workspace_write.network_access =
true) instead of opening a [sandbox_workspace_write] header. The block
therefore neither inherits from nor leaks into any surrounding table.
- upsertMulticaManagedBlock always hoists the block to the top of the
file (stripping any previously written managed block first), so the
sandbox_mode line is always at the TOML root regardless of what the
user put below it. This also migrates configs written by the original
PR #1246 logic where the block was trapped behind a user table.
Added tests for the regression scenario (pre-existing [permissions.*]
table) and the legacy-trailing-block migration; updated the existing
Linux default test and the troubleshooting runbook to reflect the
dotted-key form.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: CC-Girl <cc-girl@multica.ai>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Autopilot was formatting the triggered-at timestamp with time.RFC3339
(e.g. "2026-04-16T14:54:32Z"), which is hard to read and confusing for
users in non-UTC timezones because the "Z" suffix looks like an error
instead of a timezone indicator.
Switch to a human-readable format ("2026-04-16 14:54 UTC") so only the
hour differs from local time; minutes match across timezones, making
the value easy to reconcile at a glance.
Fixesmultica-ai/multica#1197.
- Migration 046 adds UNIQUE(workspace_id, name) with dedup (keep most recently updated)
- CreateAgent handler returns 409 Conflict scoped to constraint name agent_workspace_name_unique
- Dedup verified as (0 rows) against worktree DB; rerun against staging/production before applying
- Down migration drops the constraint only; deleted rows and cascaded data are not restored
Co-authored-by: Anup Joy <joyanup@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The issue:created subscriber listener type-asserted payload["issue"] to
handler.IssueResponse, but autopilot publishes the issue as
map[string]any (via service.issueToMap). The assertion failed silently,
so no subscribers (including the creator) were ever added to autopilot
issues — meaning creators received no notifications when their
autopilot run produced comments or status changes.
Add an extractIssueFields helper that accepts either format and use it
in both the issue:created and issue:updated listeners. Mirrors the
dual-format pattern already used by the comment:created listener.
Previously shouldEnqueueOnComment suppressed agent triggers on done/
cancelled issues, requiring an explicit @mention to resume the
conversation. The gate was non-obvious and confused users who expected
a regular reply to wake the agent up.
Drop the status check — comments are conversational and should wake
the agent up at any status. @mention already bypasses all gates, so
behavior for mentions is unchanged.
Refs multica-ai/multica#1205
* fix(daemon): normalize hostname by stripping .local mDNS suffix
Daemons started via different methods (standalone CLI vs desktop app
bundled binary) resolve the hostname differently on macOS — one gets
'computer' and the other 'computer.local'. This caused duplicate runtime
registrations for the same machine.
Stripping the .local suffix at the point of hostname resolution ensures
both always register under the same identifier.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(daemon): move empty-host fallback to after .local trim; fix Makefile @ prefix
- Reorder: TrimSuffix runs first, then empty-check, so a hostname of
just ".local" doesn't propagate as an empty daemon_id/device_name
- Add missing @ prefix on migrate command in Makefile so it isn't
echoed twice at startup
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(routing): rename /new-workspace to /workspaces/new + extend reserved slug list
Two related changes:
1. Rename the global workspace-creation route from /new-workspace to
/workspaces/new. The hyphenated word-group `new-workspace` is a
common user workspace name (last deploy was blocked by a real user
with exactly this slug). Industry consensus from auditing Linear,
Vercel, Notion, Slack, GitHub: zero major SaaS uses hyphenated
word-group root routes — they all use single words or `/{noun}/{verb}`
pairs. Reserving the noun `workspaces` automatically protects the
entire `/workspaces/*` subtree, so future workspace-related routes
(`/workspaces/{id}/edit`, `/workspaces/{id}/billing`, etc.) need no
additional reserved slugs or audit migrations.
2. Extend the reserved slug list to cover the minimal set recommended by
the URL-design audit: full auth flow vocab, RFC 2142 mailbox names
(postmaster, abuse, noreply...), hostname confusables (mail, ftp,
static, cdn...), and likely-future platform routes (docs, support,
status, legal, privacy, terms, security, etc.). Production data
audit confirmed zero conflicts for every newly added slug, so
migration 047 (the safety net) passes cleanly.
Slugs intentionally NOT added despite being in scope of the audit:
admin, multica, new, setup, www. Each has one production workspace
already using it; adding them now would block deploy. They will be
handled in a follow-up PR via owner outreach + targeted rename.
Also adds a CLAUDE.md convention rule: new global routes MUST use a
single word or `/{noun}/{verb}` pair, never hyphenated word groups.
This prevents the pattern from regenerating itself.
This PR does NOT resolve the currently-blocked prd deploy — that requires
the existing `slug='new-workspace'` workspace (owner: Dhruv Raina) to be
renamed by ops. After that workspace is renamed and migration 046 passes,
this PR's migration 047 will also pass on its first run.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* review: drop migration 046, sweep stale comments, drive reserved test from map
Address code review on PR #1188:
1. Delete migration 046 (audit_new_workspace_slug). It audits "new-workspace"
which is no longer a reserved slug after this PR's rename. Removing 046
has an unexpected upside: it directly unblocks the currently-stuck prd
deploy. Migration 046 had never successfully applied (it was the source
of the deploy block); the audit-only nature means down-rollback is a
no-op. The user workspace previously caught by 046 (slug='new-workspace',
owner: Dhruv Raina) is now safe — `new-workspace` is no longer reserved,
so the slug correctly resolves to that workspace and the global route
`/workspaces/new` doesn't shadow it.
2. Refactor workspace_test.go to drive its reserved-slug list from the
reservedSlugs map directly via `for slug := range reservedSlugs`. The
previous hand-copied list was already drifting (40-ish entries vs 58 in
the map). Now drift is impossible.
3. Sweep ~10 stale `/new-workspace` references in code comments to
`/workspaces/new`. Comments only — runtime unchanged. The references
in reserved-slugs.ts/workspace_reserved_slugs.go and CLAUDE.md are
intentionally kept as anti-pattern examples ("don't add hyphenated
word-group root routes like /new-workspace").
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(daemon): allow startup with zero workspaces
The daemon used to fail fast with "no runtimes registered" when the
initial workspace sync returned zero workspaces. This masked a latent
bug: a newly-signed-up user has no workspaces yet, so the daemon would
crash immediately after login instead of waiting for the first
workspace to be created.
workspaceSyncLoop already polls every 30s (daemon.go:107, 365) to
discover new workspaces — the fail-fast check at startup was bypassing
this dynamic discovery. Remove the check so the daemon stays resident
and picks up the first workspace whenever it appears.
PR #1001 partially addressed this for the "server has workspaces but
local CLI config is empty" case. This finishes the job for the true
zero-workspace state, which until now was masked by the onboarding
wizard always creating a workspace before the daemon started.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(views): extract CreateWorkspaceForm for reuse
Modal and the upcoming /new-workspace page share the same form +
mutation + slug validation. Extract to a shared component so they
can't drift.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(views): add NoAccessPage for unknown or inaccessible workspace slugs
Rendered when the URL slug doesn't resolve to a workspace the user has
access to. Deliberately doesn't distinguish 404 vs 403 to avoid letting
attackers enumerate workspace slugs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(paths): add /new-workspace route and reserve slug on both sides
Adds paths.newWorkspace() builder, registers /new-workspace as a global
(pre-workspace) prefix, and reserves the "new-workspace" slug on both
frontend and backend (kept in sync per convention). Existing
"onboarding" reservation retained — removing it would desync FE/BE
and leaves no future fallback if an onboarding route is revived.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore(migrations): audit no existing workspace uses 'new-workspace' slug
Migration 046 blocks deploy if any workspace in the DB has slug =
'new-workspace', which would shadow the new global workspace creation
route at /new-workspace.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add /new-workspace route on web and desktop
Renders the CreateWorkspaceForm as a full-page workspace creation flow,
used as the destination for first-time users with zero workspaces.
Replaces the 4-step onboarding wizard with a single form.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: show NoAccessPage on unknown workspace slug, hold null during active removal
Layouts render NoAccessPage when the URL slug doesn't resolve to an
accessible workspace — except when the slug previously resolved during
this layout instance's lifetime.
URL and cache are two asynchronous signals: there will always be a
short window where the URL still points at the old workspace but the
cache has already been invalidated (e.g. just after a delete/leave
mutation, or a realtime workspace:deleted event). Rendering
NoAccessPage during that window would flash "Workspace not available"
with recovery buttons in front of a user who just deleted the
workspace themselves — jarring and wrong.
useWorkspaceSeen classifies the two cases:
- slug was seen before, now gone → user's intent is changing (caller
is navigating away); render null, no flash
- slug never seen → user is genuinely looking at an inaccessible
workspace (stale bookmark, revoked access, link from a former
teammate); render NoAccessPage with recovery options
NoAccessPage deliberately does not distinguish 404 vs 403 to avoid
letting attackers enumerate workspace slugs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: redirect zero-workspace users to /new-workspace instead of /onboarding
Switches 8 call sites and the CLI:
- Web: login, auth callback, landing redirect-if-authenticated
- Desktop: routes.tsx IndexRedirect
- Shared: dashboard guard, invite page fallback, workspace-tab on delete,
realtime sync on workspace loss
- CLI: cmd_login.go waitForOnboarding now opens /new-workspace
Also adds /new-workspace to navigation store's lastPath exclusion list
so it doesn't get persisted as a 'last visited' page.
Adds a desktop App.tsx effect that restarts the daemon when workspace
count transitions 0 → ≥1, so first-workspace creation triggers
immediate daemon pickup rather than waiting up to 30s for the daemon's
workspaceSyncLoop.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: remove onboarding flow
The 4-step onboarding wizard (workspace → runtime → agent → demo issues)
is replaced by:
- /new-workspace: a single-page workspace creation form (Phase 3)
- NoAccessPage: explicit feedback when a slug doesn't resolve (Phase 4)
- daemon zero-workspace bootstrap (Phase 1) so the daemon doesn't
crash before the user creates their first workspace
- desktop daemon restart on first workspace creation (Phase 5) for
instant pickup instead of the 30s workspaceSyncLoop tick
Deletions:
- packages/views/onboarding/ (OnboardingWizard + 4 step components + tests)
- apps/web/app/(auth)/onboarding/page.tsx
- apps/desktop/src/renderer/src/components/onboarding-gate.tsx (+test)
- OnboardingGate wrapper in desktop-layout.tsx
- OnboardingRoute + /onboarding route in desktop routes.tsx
- paths.onboarding() builder + /onboarding from GLOBAL_PREFIXES
- packages/views/package.json onboarding export
- /onboarding from navigation store's EXCLUDED_PREFIXES
Retained (intentional):
- 'onboarding' in RESERVED_SLUGS (both FE + BE) — kept for FE/BE sync
and future-proofing if /onboarding is ever revived
Also drops 4 demo issues that onboarding used to create on the new
workspace ('Say hello', 'Set up repo', etc.). New workspaces are now
fully empty; all list views already render empty-state UI correctly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: clean stale 'onboarding' references in comments and CLI helpers
Batch cleanup of references to the removed onboarding flow:
- 13 comment sites mentioning 'onboarding' updated to reflect the
new /new-workspace flow or removed where no longer accurate
- CLI waitForOnboarding renamed to waitForWorkspaceCreation (function
name + docstring); behavior unchanged
The 'onboarding' reserved slug entries (frontend + backend) are
intentionally retained — see prior commit rationale.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(views): extract shared NewWorkspacePage shell
The web (/new-workspace) and desktop (NewWorkspaceRoute) pages had
identical outer layout — same container, heading, and copy — with only
the onSuccess navigation primitive differing. That's exactly the
No-Duplication Rule pattern: extract the shared UI, inject the
platform-specific behavior.
The apps now only own the thin auth guard (web needs it, desktop
routes below WorkspaceRouteLayout already handle it) and the
onSuccess → navigate call.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: remove rollback compat layer and tighten daemon restart trigger
Two cleanup items:
1. Drop localStorage['multica_workspace_id'] double-write in both
workspace layouts. That write was added as a rollback safety net
for the workspace-slug URL refactor (PR #1138) — the refactor has
since landed and stabilized, so the compat shim is no longer
needed. Per CLAUDE.md: don't keep compat layers beyond their
purpose.
2. Tighten the desktop daemon-restart trigger. The previous ref-based
logic fired a restart on any 0→1 workspace-count transition,
including account switches (user A logout → user B login). Scope
it precisely to 'this session started with zero workspaces and
just gained one' using a three-state ref (null=undecided,
true=empty-start, false=already-restarted-or-started-nonempty).
Account switches are already handled by daemon-manager.ts on
token change, so this avoids a redundant restart there.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(auth): redirect to /login on logout and unauthenticated workspace visits
Two gaps previously left users stuck on blank workspace pages:
1. app-sidebar logout() cleared all state but never moved the URL. The
current path is /{workspaceSlug}/... which has no meaning without
auth; the workspace layout would then see user=null, render null
(via the hasBeenSeen short-circuit), and the user saw a blank page
thinking logout didn't work.
2. The workspace layouts (web + desktop) had no !user handling at all.
Any path that leaves user=null — token expiration, cross-tab logout,
or fresh visit to a workspace URL without a session — resulted in
the same blank screen.
Fix:
- app-sidebar.logout() explicitly push(paths.login()) after authLogout()
to cover the primary (user-initiated) logout path.
- Both workspace layouts get a defensive useEffect that redirects to
/login whenever auth has settled and user is null. Covers token
expiration, realtime logout, and any other silent session loss.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GetWorkspaceUsageByDay and GetWorkspaceUsageSummary had the same date
attribution bug as the runtime endpoint fixed in #1167: they bucketed
and filtered on agent_task_queue.created_at (enqueue time), so a task
that queued at 23:58 and reported usage at 00:05 was attributed to the
prior day, and ?days=N became a rolling now()-N window that clipped the
morning of the earliest day returned.
Switch both queries to task_usage.created_at (~= task completion time)
and snap the since cutoff to start-of-day via DATE_TRUNC, mirroring
ListRuntimeUsage.
These endpoints have no frontend caller today, but per offline
discussion they will back the upcoming workspace-level usage dashboard.
Fix preemptively so the dashboard inherits correct numbers.
Add a regression test covering both endpoints with the same
cross-midnight + earliest-day-cutoff scenarios used for runtime usage.
* refactor(runtime): derive runtime usage from task_usage only
The daemon used to scan each runtime's local CLI log directory every 5
minutes (Claude Code, Codex, OpenCode, OpenClaw, Hermes) and post daily
aggregates to /api/daemon/runtimes/{id}/usage. Those directories are
shared with the user's own local CLI sessions, so the user's personal
usage was being counted as Daemon-executed usage. Cursor and Gemini had
no scanner at all, so their runtime-level aggregates were always zero.
Switch GetRuntimeUsage to aggregate task_usage (already scoped to
Daemon-executed tasks) via agent_task_queue.runtime_id. Single source of
truth; Cursor/Gemini/Copilot get runtime usage for free; no reliance on
external CLI log formats.
Removes:
- server/internal/daemon/usage/ (all scanners)
- Daemon.usageScanLoop + providerToRuntimeMap
- Client.ReportUsage
- ReportRuntimeUsage handler + POST /api/daemon/runtimes/{id}/usage
- UpsertRuntimeUsage / GetRuntimeUsageSummary queries
- runtime_usage table (migration 046)
Refs: MUL-786
* fix(runtime): bucket daily usage by task_usage.created_at, not enqueue time
ListRuntimeUsage was aggregating by DATE(atq.created_at) and filtering
on atq.created_at. agent_task_queue.created_at is the enqueue timestamp,
which drifts from actual token-production time: a task queued at 23:58
and executed at 00:05 was attributed to yesterday; a task sitting in
the queue overnight was counted on the queue day.
The ?days=N cutoff also became a rolling window (now() - N) instead of
a calendar-day boundary, silently clipping the morning of the earliest
day returned.
Switch bucket + filter to task_usage.created_at (~= task completion /
usage-report time) and snap the since cutoff to start-of-day via
DATE_TRUNC.
Add a regression test covering both scenarios: cross-midnight task
attributes to the day tokens were reported, and the earliest day's
pre-cutoff rows are still included.
Every other backend (Claude, Gemini, OpenCode, OpenClaw, Hermes) honors
ExecOptions.ResumeSessionID — only Codex didn't. That's why users on
the Codex runtime saw each new comment on an issue start a fresh Codex
conversation: the daemon persists Result.SessionID per (agent, issue)
and passes it back as PriorSessionID, but codex.go always called
thread/start and never populated SessionID, so the value round-tripped
as empty.
Wire the missing half:
- Extract startOrResumeThread on codexClient. When ResumeSessionID is
set, call thread/resume (per the Codex app-server protocol), passing
only cwd / model / developerInstructions overrides so the thread
keeps its persisted model and reasoning effort. If resume fails
(unknown thread, schema drift, transport error) fall back to
thread/start so the task still runs on a fresh thread.
- Surface the live threadID as Result.SessionID on the final emit so
the daemon stores it and feeds it back into ResumeSessionID on the
next claim.
Tests drive the new helper through the fake stdin harness, covering:
fresh start, successful resume, fallback on resume error, fallback
when resume returns no thread ID, and surfacing of thread/start
failures.
Problem
-------
The v2 workspace URL refactor (#1141) switched the frontend from sending
X-Workspace-ID (UUID) to X-Workspace-Slug. The workspace middleware was
updated to accept the slug and translate it via GetWorkspaceBySlug.
But the handler package maintained a PARALLEL resolver
(`resolveWorkspaceID` in handler.go) used by endpoints that sit outside
the workspace middleware — and that resolver was never updated. It only
checked context / ?workspace_id / X-Workspace-ID, never the slug.
/api/upload-file is the one production route that hit the broken path:
it's user-scoped (not behind workspace middleware) because it also
serves avatar uploads (no workspace). Post-refactor requests from the
frontend arrived with only X-Workspace-Slug; the handler resolver
returned "", the code fell into the "no workspace context" branch, and
every file upload since v2 landed in S3 with no corresponding DB
attachment row — files orphaned, invisible to the UI.
Root cause is structural: two resolvers doing the same job, written
independently, diverged silently when one was updated.
Fix
---
Collapse to a single shared helper. middleware.ResolveWorkspaceIDFromRequest
is the new canonical resolver; both the middleware's internal
`resolveWorkspaceUUID` (for middleware gating) and the handler-side
`(h *Handler).resolveWorkspaceID` (promoted from a package function)
now delegate to it. Priority order matches what the middleware has had
since v2: context > X-Workspace-Slug header > ?workspace_slug query >
X-Workspace-ID header > ?workspace_id query.
Impact analysis
---------------
47 call sites of the old `resolveWorkspaceID(r)` are renamed to
`h.resolveWorkspaceID(r)`. 46 of them sit behind workspace middleware,
so they hit the context fast path and see zero behavior change. The
one caller that actually gains capability is UploadFile — which now
correctly recognizes slug requests and creates DB attachment rows.
Tests
-----
- New table-driven unit test for ResolveWorkspaceIDFromRequest covers
all priority levels and the unknown-slug fallback.
- Regression tests for UploadFile: once with X-Workspace-Slug only
(the broken path), once with X-Workspace-ID only (legacy CLI/daemon
compat path). Both assert that a DB attachment row is created.
- Full Go test suite passes; typecheck + pnpm test unaffected.
Plan
----
See docs/plans/2026-04-16-unify-workspace-identity-resolver.md for the
full first-principles writeup.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(agent): add GitHub Copilot CLI backend
Integrate Copilot CLI as a new agent backend using the stable
`-p` JSONL mode (`--output-format json`), following the same
spawn-CLI-scan-JSONL pattern established by claude.go.
Backend (server/pkg/agent/copilot.go):
- Spawn `copilot -p <prompt> --output-format json --allow-all-tools --no-ask-user`
- Parse streaming JSONL events (system/assistant/user/result/log)
- Extract session ID for resume support (`--resume <id>`)
- Accumulate per-model token usage for billing
- Filter blocked args to prevent protocol-critical flag overrides
Daemon config:
- Probe MULTICA_COPILOT_PATH / MULTICA_COPILOT_MODEL env vars
- Copilot uses AGENTS.md (native discovery) and default skills path
Frontend:
- Add Copilot logo SVG and provider switch case
Tests: 14 unit tests covering arg building, event parsing, usage
accumulation, and edge cases. All Go + TS checks pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(daemon): add restart subcommand, make daemon uses it
- `daemon start` keeps original behavior: errors if already running
- `daemon restart` stops existing daemon then starts fresh
- `make daemon` now runs `daemon restart --profile local`
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(copilot): address review nits 1-5
- Nit 1: Add MinVersions["copilot"] = "1.0.0"
- Nit 2: Seed activeModel from session.start.data.selectedModel (falls
back to opts.Model, then "copilot"). First-turn tokens now get correct
model attribution.
- Nit 3: Handle assistant.reasoning/reasoning_delta → MessageThinking,
reasoningText in assistant.message → MessageThinking,
session.warning → MessageLog{warn}
- Nit 4: Extract handleCopilotEvent() method shared by production and
tests — no more duplicated switch body that can drift
- Nit 5: Deltas write to output buffer as defense-in-depth; if process
dies before assistant.message, output is non-empty
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When codex emits `turn/completed` with `status="failed"` or a terminal
top-level `error` notification, the daemon previously treated the turn
as successfully completed, saw no accumulated text, and surfaced the
generic "codex returned empty output" — hiding the real reason (auth,
sandbox, API error, etc.).
Capture `turn.error.message` on failed turns and the `error.message`
from non-retrying top-level error notifications, then propagate them
through `Result.Error` with `finalStatus="failed"` so the daemon's
default branch reports the actual cause.
* feat(agent): add Cursor Agent CLI runtime support
Add cursor-agent as a new agent backend, following the same pattern as
existing providers. The implementation spawns cursor-agent CLI with
stream-json output, parses JSONL events into the unified Message type,
and supports session resume, usage tracking, and auto-approval (--yolo).
Changes:
- server/pkg/agent/cursor.go: cursorBackend implementation
- server/pkg/agent/cursor_test.go: unit tests for args, parsing, errors
- server/pkg/agent/agent.go: register "cursor" in New() factory
- server/internal/daemon/config.go: probe cursor-agent in PATH
- server/internal/daemon/execenv/context.go: cursor skill discovery path
- server/internal/daemon/execenv/runtime_config.go: AGENTS.md injection
- packages/views/.../provider-logo.tsx: cursor logo in UI
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(agent): address PR review for cursor backend
1. Fix token usage double-counting: usage is now taken exclusively from
"result" events (session totals). Per-message usage in "assistant"
events is intentionally ignored. "step_finish" usage is only used as
fallback when no "result" usage is available.
2. Remove dead code: isCursorUnknownSessionError() and its regex were
defined but never called. Removed along with corresponding test.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(agent): add missing CustomArgs, SystemPrompt, MaxTurns, and debug logging to cursor backend
- Add cursorBlockedArgs and filterCustomArgs support for safe custom arg passthrough
- Add --system-prompt and --max-turns flag support to buildCursorArgs
- Add debug logging of command args before execution (consistent with all other backends)
- Move stdout-close goroutine inside main goroutine (consistent with claude.go pattern)
- Add tests for SystemPrompt/MaxTurns and CustomArgs filtering
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* chore: make daemon uses local profile & update Cursor logo to official brand
- Makefile: make daemon now runs 'daemon start --profile local' for local dev
- Replace Cursor runtime logo with official brand SVG (removed background rect)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(agent): remove unsupported --system-prompt and --max-turns from cursor-agent
cursor-agent CLI does not support these flags. Instructions are already
injected via AGENTS.md and .cursor/skills/ files.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(agent): prevent step_finish + result usage double-counting in cursor
Split usage accumulation into separate stepUsage and resultUsage maps.
After stream ends, use resultUsage if available (session totals from
result event), otherwise fall back to stepUsage (sum of step_finish).
This prevents 2x counting when result.usage already includes totals.
Added table-driven test covering: result-only, step_finish-only,
step_finish+result (no double count), and multi-model scenarios.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* docs(agent): fix misleading comment on cursor -p flag
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: yushen <ldnvnbl@gmail.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* feat(agent): add Pi agent runtime support
Add Pi as a new agent runtime provider, following the established adapter
pattern. Pi CLI outputs JSONL events which are parsed for messages, tool
calls, and usage tracking.
Backend:
- New piBackend implementing the Backend interface (pi.go)
- Pi CLI discovery via MULTICA_PI_PATH env var or PATH lookup
- JSONL event stream parsing (agent_start, message_update, thinking_update,
tool_execution_start/end, agent_end)
- Usage scanner for ~/.pi/sessions/*.jsonl files
- Runtime config injection via AGENTS.md
- Skill injection to .pi/agent/skills/
Frontend:
- Pi provider logo (teal π icon)
- Pi label in transcript dialog
Docs:
- Updated all provider lists in README, CLI_INSTALL, and docs
* fix(agent): filter Pi usage scanner to agent_end events only
Address review feedback: restrict usage parsing to agent_end events
which contain cumulative totals, preventing potential inaccuracy if
Pi adds usage fields to other event types in the future.
* fix(agent): align Pi runtime with real CLI flags, event schema, and custom_args
- Flags: Pi's CLI uses `--mode json` (not `--output-format jsonl`), has no
`--yolo` (explicit `--tools` allowlist instead), takes the prompt as a
positional argument (not `-p <prompt>`), splits model as
`--provider <name> --model <id>`, and treats `--session` as a file path
that must exist before spawn.
- Event parsing: rewrite the stream event struct to match Pi's actual
JSON event schema (`message_update.assistantMessageEvent.delta`,
`turn_end.message.usage.{input,output,cacheRead,cacheWrite}`, etc.).
- Sessions: generate/persist session files under ~/.multica/pi-sessions/
and use the file path as the opaque SessionID returned to the daemon.
- Usage scanner: read assistant `message` events from the same session
files (Pi's session-file schema, distinct from the stdout stream).
- Custom args: consume `ExecOptions.CustomArgs` via `filterCustomArgs`
with a Pi-specific blocked set (`-p`, `--print`, `--mode`, `--session`)
so Pi matches the pattern shared by every other agent backend.
GetActiveTaskForIssue, CancelTask, ListTasksByIssue, and GetIssueUsage
accepted the issueId URL parameter and queried by it without verifying
that the issue belonged to the caller's X-Workspace-ID workspace. The
RequireWorkspaceMember middleware only proves membership in the header
workspace; it does not bind the path-parameter issue to it. A member of
workspace A could therefore enumerate tasks, cancel tasks, and read
usage metadata for any issue UUID in workspace B.
Route every issueId through loadIssueForUser (matching GetIssue and the
existing comment/subscriber handlers). For CancelTask additionally
verify that the task's IssueID matches the loaded issue — the task must
not only belong to the caller's workspace but also to the specific
issue named in the URL, and the access check must run before any
mutation.
Follow-up to MUL-899 / #1112.
Every other backend (claude, codex, opencode, openclaw, gemini) filters
opts.CustomArgs through a per-backend blocked map so protocol-critical
flags can't be overridden via the Create Agent UI. The hermes backend
appended CustomArgs directly to argv, so any future flag we add to the
map would be silently bypassed here.
Add hermesBlockedArgs (with 'acp' as the pinned subcommand) and route
CustomArgs through filterCustomArgs. Behaviour is identical for today's
use cases; the change prevents accidental protocol-flag overrides and
brings hermes in line with the other five backends.
Closes#1113
Co-authored-by: shaun0927 <shaun0927@users.noreply.github.com>
Adds TestGetIssueGCCheck_WithDaemonToken_CrossWorkspace alongside the
existing TestGetTaskStatus_WithDaemonToken_CrossWorkspace, covering:
- daemon token scoped to a different workspace → 404 (matches the
"issue not found" status, so no UUID enumeration oracle)
- daemon token scoped to the issue's workspace → 200 with status and
updated_at fields populated
Follow-up to #1121, which fixed the underlying IDOR reported in #1112
but did not ship a regression test. This gates the class of bug at CI
so the next handler to forget requireDaemonWorkspaceAccess will be
caught before merge.
After the slug-first URL refactor, the frontend sends X-Workspace-Slug
and the workspace middleware resolves it into a UUID stored in the
request context. The inbox handlers still read X-Workspace-ID directly
from the request header, which is now absent, so every inbox query ran
with an empty workspace_id and returned zero rows.
Switch all six inbox handlers to ctxWorkspaceID(r.Context()), matching
the pattern already used by chat / issue / project / autopilot. No
frontend changes required — the slug header path was already correct.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The daemon GC check endpoint did not verify the caller's access to the
issue's workspace, letting a daemon token or PAT scoped to workspace A
read issue status/updated_at for any issue UUID across the instance.
Mirror the pattern used by every other handler in daemon.go: look up
the issue's workspace and gate on requireDaemonWorkspaceAccess.
Closes#1112
Co-authored-by: shaun0927 <shaun0927@users.noreply.github.com>
Follow-up to #1126 (which closed the HTML-injection vector in the Body).
The Subject line is not HTML-rendered, so html.EscapeString would leak
literal entities into recipient inboxes. Instead:
- Strip control characters from workspace/inviter names (defense in depth
even though Resend also filters CR/LF).
- Cap each field at 60 runes so an attacker can't stuff a full phishing
pitch into a workspace name that gets sent from noreply@multica.ai.
Also extracts buildInvitationParams to make the sanitization logic
testable without mocking the Resend SDK, and adds a test covering:
- HTML escape behavior for script/attribute/anchor injection payloads
- Subject stripping of \r\n\t and other unicode controls
- Subject NOT being HTML-escaped (so "Acme & Co." stays literal)
- Subject length bounds
- Benign inputs pass through unchanged
Adds a note on SendVerificationCode that its body uses only
server-generated content, to prevent the same pitfall from creeping in.
Refs #1117
* fix(email): HTML-escape workspace/inviter names in invitation email
SendInvitationEmail interpolated workspaceName and inviterName directly
into the HTML body via fmt.Sprintf with no escaping. A workspace owner
who sets a name like '</h2><a href="https://evil.example">Click</a>'
can break the email structure and inject attacker-controlled links that
appear as part of the official Multica invitation.
Escape both values with html.EscapeString before interpolation. The
Subject line also gets the escaped variants since some transports render
HTML-entity-like sequences.
Closes#1117
* fix(email): use raw names in Subject, keep HTML-escape for body only
Email Subject is a plain-text context — applying html.EscapeString
turns "A&B" into "A&B" and "O'Brien" into "O'Brien" in the
recipient's inbox. Keep the escape for the Html body where it prevents
injection, but use the original values in Subject.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: shaun0927 <shaun0927@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Reapply "feat: workspace URL refactor + slug-first API identity (#1131)" (#1137)
This reverts commit 9b94914bc8.
* compat: legacy URL redirect + localStorage double-write for safe rollback
The first attempt at this refactor (#1131) was reverted because existing
users on old URLs (/issues, /projects, etc.) hit 404 immediately after
deploy, and rolling back left them with empty dashboards — the legacy
code reads localStorage["multica_workspace_id"] to attach a workspace
to API requests, but the new code had stopped writing that key.
Two compat layers added on top of the restored refactor:
1. proxy.ts now intercepts legacy route prefixes (/issues/*, /projects/*,
/agents/*, /inbox/*, /my-issues/*, /autopilots/*, /runtimes/*,
/skills/*, /settings/*). Logged-in users with a last_workspace_slug
cookie are 302'd to /{slug}/{rest}, preserving their deep link. Users
without the cookie bounce through / where the landing page picks a
workspace client-side. Unauthenticated users go to /login.
2. Both layouts now double-write the workspace id to the legacy
localStorage key on every workspace entry. New code ignores this key
— it exists solely so that if this PR ever gets reverted again, the
legacy build reading the key would still find the correct workspace
and avoid the empty-dashboard symptom users saw during the rollback.
Net effect: any direction of deploy ↔ rollback is now cache-compatible,
and any direction of old bookmark → new route resolves without 404.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(platform): defer rehydrateAllWorkspaceStores to a microtask
Same React 19 render-phase restriction that forced setCurrentWorkspace
to defer its subscriber notifications. rehydrateAllWorkspaceStores
synchronously calls each persist store's rehydrate, which setState()s
the store, which schedules updates on any subscribed component. When
the workspace layout's render-phase ref guard invoked this, React
complained that SearchCommand (a store subscriber) couldn't be
re-rendered while WorkspaceLayout was still rendering.
Fix: queueMicrotask the rehydrate loop and add a pending-flag guard so
rapid workspace switches coalesce into one rehydrate on the final slug.
Persist stores tolerate one microtask of staleness — they hold UI
preferences, not correctness-critical state.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>