mirror of
https://github.com/multica-ai/multica.git
synced 2026-07-05 13:29:44 +02:00
* fix(server): recognize official cloud by frontend host in daemon setup config

The 'Add a computer' dialog builds its command from /api/config's daemon_server_url/daemon_app_url, falling back to 'multica setup' when both are empty. The official cloud is meant to omit them, but the omission only fired when MULTICA_PUBLIC_URL=https://api.multica.ai. When that env is unset the server URL defaults to the frontend origin and the old guard (which required serverURL host == api.multica.ai) didn't match, so the dialog emitted 'multica setup self-host --server-url https://multica.ai', pointing the daemon backend at the frontend (no /health, no WebSocket proxy).

Identify the official cloud by its frontend host alone (multica.ai / app.multica.ai) so a missing or misconfigured MULTICA_PUBLIC_URL can no longer leak the broken self-host command. Regression from #3474.

* fix(cli): probe before persisting self-host config to preserve auth on failure

setup self-host wrote a fresh CLIConfig{ServerURL, AppURL} (a full overwrite that drops the saved token) and only then probed the server, returning early on failure. A failed probe therefore logged the user out and left them unconnected, with no recovery in the same command.

Probe first via persistSelfHostConfigIfReachable: an unreachable server leaves the existing config (and its token) untouched (failed setup = no-op). The prober is injected so both branches are unit-tested.

* fix(daemon): serve health before preflight so daemon start readiness is accurate

The CLI's 'daemon start' polls the health endpoint for 15s expecting status=running, but the daemon only began serving health after preflightAuth, whose initial workspace sync detects every configured agent's version by exec'ing it (~20s cold with 8 agents). Health served too late, so a perfectly healthy daemon printed 'may not have started successfully'.

Start the health server right after resolveAuth (which still fails fast on a missing token) and before the slow preflight, so readiness reflects the daemon core being up rather than agent-version detection finishing.

* fix(daemon): gate /health readiness so daemon start can't report a false start

Serving health before preflightAuth fixed the false-negative (a healthy daemon printed "may not have started"), but health still returned status:"running" unconditionally, before preflight (PAT renew + workspace sync + runtime registration) had completed. `daemon start` and the desktop treat "running" as ready, so a slow or *failing* preflight could be misreported as a started daemon: setup prints "connected", then the process exits or hangs in agent-version detection with no runtime registered. That is harder to diagnose than the original false-negative.

Split liveness from readiness: bind/serve the health port early (so callers see a live "starting" daemon instead of connection-refused), but report status:"starting" until d.ready is set after preflight, then "running".

- daemon.go: add d.ready (atomic.Bool); set it true after the background loops launch, before pollLoop.
- health.go: healthHandler reports "starting" until ready, else "running".
- cmd_daemon.go: `daemon start` waits for "running" with a deadline raised to 45s (covers cold-start agent detection) and a clearer "still starting" message; new daemonAlive() helper treats both "running" and "starting" as a live daemon, so the already-running guard, restart, and stop act on a starting daemon and don't double-spawn or race its listener; `daemon status` shows "starting" distinctly.

Older CLIs/desktop that only know "running" safely treat "starting" as not-ready (status != "running"), so no boundary break.

Tests: health reports starting-then-running; daemonAlive truth table.

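The liveness/readiness split above can be sketched as follows. This is a minimal illustration, not the code from daemon.go or health.go: the `Daemon` struct, the JSON body shape `{"status": ...}`, and the `daemonReady` and `probe` helpers are assumptions made for the example; only `d.ready` (atomic.Bool), `healthHandler`, `daemonAlive`, and the two status strings come from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// Daemon is a hypothetical stand-in for the real daemon type.
// Only the ready flag is taken from the commit description.
type Daemon struct {
	ready atomic.Bool // set to true once preflight has completed
}

// healthHandler reports "starting" until ready is set, then "running".
// The JSON shape is an assumption for this sketch.
func (d *Daemon) healthHandler(w http.ResponseWriter, r *http.Request) {
	status := "starting"
	if d.ready.Load() {
		status = "running"
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(map[string]string{"status": status})
}

// daemonAlive is the liveness check: a starting daemon already owns the
// listener, so callers must not spawn a second one.
func daemonAlive(status string) bool {
	return status == "running" || status == "starting"
}

// daemonReady (hypothetical name) is the readiness check used when
// waiting for startup to finish.
func daemonReady(status string) bool {
	return status == "running"
}

// probe calls the handler in-process and returns the reported status.
func probe(d *Daemon) string {
	rec := httptest.NewRecorder()
	d.healthHandler(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
	var body map[string]string
	_ = json.NewDecoder(rec.Body).Decode(&body)
	return body["status"]
}

func main() {
	d := &Daemon{}
	before := probe(d) // health is served, preflight not finished yet
	d.ready.Store(true)
	after := probe(d) // preflight done

	fmt.Println(before, daemonAlive(before), daemonReady(before)) // starting true false
	fmt.Println(after, daemonAlive(after), daemonReady(after))    // running true true
}
```

A client that only checks `status == "running"` sees the "starting" phase as not-ready, which is why older CLIs and desktop builds keep working without changes.
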
Co-authored-by: multica-agent <github@multica.ai>

* fix(desktop): handle daemon "starting" health status in lifecycle

The daemon now reports /health status:"starting" until preflight completes (liveness/readiness split). That made "starting" a new external contract of /health, but the Desktop daemon-manager only knew "running", so the readiness fix would have moved the CLI's false-negative into a Desktop start regression:

- `daemon start` now blocks up to 45s waiting for readiness, but the Desktop spawned it via execFile({ timeout: 20_000 }). On a cold start (the ~20s agent detection this PR targets) Electron killed the CLI supervisor at 20s and reported a start failure, even though the detached daemon child kept booting; the UI flashed "stopped" then "running". Raise the timeout to 60s (must exceed the CLI's 45s startupTimeout).
- The Desktop treated only raw status === "running" as a live daemon, so a daemon that was still "starting" (booting on its own or started via the CLI) showed as "stopped", and startDaemon() would spawn a second one, which the new CLI rejects as "already running", surfacing as a start error.

Add daemonStatusAlive() (shared, pure, unit-tested) mirroring the Go daemonAlive() and use it for liveness: fetchHealth() surfaces a daemon-reported "starting" as state "starting" regardless of our own currentState; startDaemon()'s already-running guard and the restart-on-user-switch guard treat "starting" as an existing daemon. version-decision stays gated on "running" (readiness, not liveness), unchanged.

Verified: desktop typecheck, eslint, full vitest suite (193 tests) all pass.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>