mirror of
https://github.com/multica-ai/multica.git
synced 2026-07-05 13:29:44 +02:00
09f04847d376f692337ac9cc698499b7bf366dda
13 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
09f04847d3 | feat(server): redis-backed runtime liveness with DB fallback (#2121) | ||
|
|
b1345685a3 |
fix(task): rerun starts a fresh session, skip poisoned resume (#1928)
* fix(task): rerun starts a fresh session, skip poisoned resume
When a task ended in a known agent fallback ("I reached the iteration
limit and couldn't generate a summary.", "Put your final update inside
the content string. Keep it concise.") the (agent_id, issue_id) resume
lookup would still pick that session, so a manual rerun inherited the
poisoned state and reproduced the same bad output.
Two complementary guards:
1. Daemon classifies poisoned terminal output and routes it through the
blocked path with failure_reason set ('iteration_limit' /
'agent_fallback_message'). GetLastTaskSession excludes failed tasks
with those reasons, so even comment-triggered tasks no longer resume
them. Tasks that failed mid-flight (timeout, runtime_recovery, etc.)
are still resumable, preserving MUL-1128's auto-retry contract.
2. Manual rerun marks the new task force_fresh_session=true. The daemon
claim handler skips the resume lookup entirely when the flag is set,
capturing the user-intent signal that "the prior output was bad" even
when poisoned classification misses a future fallback wording.
Auto-retry of orphaned mid-flight failures (MaybeRetryFailedTask →
CreateRetryTask) does not take this path, so it keeps resuming.
Tests: classifyPoisonedOutput unit test; integration tests assert the
SQL filter excludes poisoned classifiers, RerunIssue flips the flag,
and the normal enqueue path leaves it false.
Co-authored-by: multica-agent <github@multica.ai>
* fix(daemon): cap poisoned-output matcher to short trimmed text
GPT-Boy review on MUL-1630: the previous strings.Contains match would
classify any output that quoted the marker substring — including a
review/analysis that simply discussed the marker itself. Real fallback
messages are short single-sentence affairs, so cap the candidate at
~one paragraph and trim whitespace before matching. Adds regression
tests covering a long quoting review and a marker buried in a long
real conclusion; both must stay classified as completed.
Co-authored-by: multica-agent <github@multica.ai>
* fix(migrations): rename 065 force_fresh_session → 066 to clear collision
main introduced 065_project_resources after this branch was cut, so
both files shared the 065_ prefix. The readiness check
(server/cmd/server/health.go → migrations.LatestVersion) takes the
last entry by lexical order, which is 065_project_resources, leaving
this branch's 065_force_fresh_session unguarded — a deploy that
applied project_resources but not force_fresh_session would still
report ready, and the next enqueue / rerun / claim would crash on
"column force_fresh_session does not exist".
Renaming to 066_force_fresh_session puts it strictly after
project_resources so readiness blocks until it's applied.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
|
||
|
|
f745a3bbbe |
feat(agent): presence v3 + execution log + trigger summary (#1823)
* refactor(views): migrate agent/runtime/skill lists to TanStack DataTable
Replace the per-page CSS Grid + minmax(min, fr) + sticky-first-col + truncate
implementation with a TanStack Table backend rendered through a Dice UI-style
DataTable shell. Column widths are now px-based via column.size, so cells
no longer shrink or auto-truncate as the viewport narrows; when the sum of
columns exceeds the viewport, the container scrolls horizontally instead.
- Add @tanstack/react-table to the catalog (8.21.3) and wire it into
packages/ui (dep) and packages/views (peerDep).
- packages/ui: new DataTable + DataTableColumnHeader + lib/data-table.ts
(getColumnPinningStyle), adapted from Dice UI's registry. The shell
renders <table> directly (skipping shadcn's <Table> wrapper) so its own
outer overflow controls both axes — no nested overflow conflicts.
- packages/views: each list now declares ColumnDef[] with explicit
cell renderers. Row click navigates to detail via onRowClick (instead of
wrapping <tr> in <a>, which is invalid HTML); kebab dropdowns
stopPropagation so they don't trigger the row navigation.
- Drop the previous AGENT_LIST_GRID / GRID_WITH_OWNER / ROW_GRID
templates and the sticky-first-col / subgrid mechanics that came with
them. agent-list-item.tsx is removed; runtime-list.tsx and
skills-page.tsx are trimmed to thin wrappers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(agent): cap description at 255 chars (db + api + ui)
Symmetric enforcement across DB, server, and UI:
- Migration 060: pre-flight truncate of any oversize rows, then ADD
CONSTRAINT NOT VALID + VALIDATE CONSTRAINT so the new check doesn't
block writes during validation.
- Server handler validates utf8.RuneCountInString on Create/Update and
rejects over-limit input with 400.
- Front-end gets AGENT_DESCRIPTION_MAX_LENGTH in core/agents/constants
(single source of truth shared by the create dialog + edit modal +
test suite) and a CharCounter component that warns at 90% and errors
past the cap.
- Description editor moves from a 288px popover to a roomy modal.
Editor body is mounted only while the dialog is open, so the local
draft state is locked in at mount time and never reset by an external
WS update — the React-recommended replacement for the
useEffect(reset, [value]) anti-pattern.
Counted in code points everywhere (rune count / spread length /
char_length) so multibyte input agrees across all three layers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(views): data-table polish across runtime + skill lists
Builds on the DataTable migration in
|
||
|
|
0b1333fb00 |
feat(server): orphan-task recovery + auto-retry + manual rerun (MUL-1128) (#1476)
* feat(server): orphan-task recovery + auto-retry + manual rerun (MUL-1128)
When the daemon process crashed mid-task the issue was stuck at
in_progress for up to 2.5h: the in-flight task timeout was the only
mechanism that ever moved the row, and the runtime heartbeat sweeper
only fires after the runtime stays offline for 45s — a quick restart
beats both windows.
This change implements the A+B plan from the issue thread:
A. lifecycle hygiene
- migration 055 adds attempt / max_attempts / parent_task_id /
failure_reason / last_heartbeat_at to agent_task_queue
- new daemon-auth endpoint POST /runtimes/{id}/recover-orphans:
daemon calls it on every register so the server fails any
dispatched/running tasks the previous process left behind
- new daemon-auth endpoint POST /tasks/{id}/session: persists the
agent's session_id + work_dir mid-flight so a crash doesn't
lose the resume pointer (claude+codex emit MessageStatus with
SessionID; daemon forwards on the first one it sees)
- FailAgentTask / FailStaleTasks / FailTasksForOfflineRuntimes
now set failure_reason ('agent_error' / 'timeout' /
'runtime_offline')
B. auto-retry with resume context
- TaskService.MaybeRetryFailedTask spawns a fresh queued attempt
carrying parent's session_id/work_dir when the failure reason
is infrastructure-shaped (timeout, runtime_offline,
runtime_recovery) and attempt < max_attempts; skips autopilot
- wired into the runtime sweeper paths and TaskService.FailTask
so the user transparently sees a new in_progress run instead of
a stuck row
- new user-auth POST /api/issues/{id}/rerun + multica issue rerun
CLI for the manual escape hatch
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(server): address PR review for orphan-task recovery (MUL-1128)
Three review-must-fix items on top of the A+B implementation:
1. recover-orphans now funnels through TaskService.HandleFailedTasks,
the same shared post-failure pipeline used by the runtime sweeper.
This guarantees task:failed events are emitted, agent status is
reconciled, and issues stuck in_progress with no remaining active
task are reset to todo even when no auto-retry is created
(max_attempts exhausted, autopilot, non-retryable reason).
2. RerunIssue now uses CancelAgentTasksByIssueAndAgent, scoped to the
issue's current assignee. The previous implementation called
CancelAgentTasksByIssue, which would collateral-cancel parallel
@-mention agents on the same issue.
3. GetLastTaskSession now considers both completed and failed tasks
(mirroring GetLastChatTaskSession), ordering by the most recent
timestamp. With UpdateAgentTaskSession pinning session_id/work_dir
mid-flight, an auto-retry or manual rerun of a daemon-crash failure
now actually resumes the prior conversation context instead of
starting fresh — matching the stated B-branch behaviour.
go build / go vet pass; the existing service and agent test suites pass.
runtime_sweeper / handler integration tests require a local DB with the
055 migration (and the pre-existing 050 first_executed_at column).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
||
|
|
637bdc8eb3 |
feat(analytics): full PostHog pipeline + 6 funnel events (MUL-1122) (#1367)
* feat(analytics): add PostHog client with async batch shipping Introduces server/internal/analytics, the shipping layer for the product funnel defined in docs/analytics.md. Capture is non-blocking — events are enqueued into a bounded channel and a background worker batches them to PostHog's /batch/ endpoint. A broken backend drops events rather than blocking request handlers. Local dev and self-hosted instances run a noop client until the operator sets POSTHOG_API_KEY. This is PR 1 of MUL-1122; signup and workspace_created emission land in the follow-up commit so this change is independently reviewable. * feat(server): emit signup and workspace_created analytics events Wires analytics.Client through handler.New and main, then emits the first two funnel events: - signup fires from findOrCreateUser (which now reports isNew), covering both the verification-code and Google OAuth entry points — a single emission site guarantees Google signups aren't missed. - workspace_created fires after the CreateWorkspace transaction commits, with is_first_workspace computed from a post-commit ListWorkspaces count so we can distinguish fresh-user activation from returning-user expansion. Tests use analytics.NoopClient so nothing ships from test runs. PR 1 of MUL-1122; runtime_registered and issue_executed follow in later PRs per the plan. * refactor(analytics): drop is_first_workspace from workspace_created Stamping "is this the user's first workspace?" at emit time races under concurrent CreateWorkspace requests: two transactions committing close together can both read a post-commit count greater than one and both emit false. Fixing it at the SQL layer requires a schema change we don't want in PR 1. PostHog answers the same question exactly from the event stream (funnel on "first time user does X" / cohort on $initial_event), so removing the property loses no information and makes the emit side race-free. * docs(analytics): document self-host safety defaults Spell out why self-hosted instances never ship events upstream by default (empty POSTHOG_API_KEY → noop client) and explain how operators can point at their own PostHog project without any code change. * feat(analytics): emit runtime_registered, issue_executed, team_invite_* Three server-side funnel events, all gated on first-time state transitions so retries and re-runs don't inflate the WAW buckets: - runtime_registered fires from DaemonRegister when UpsertAgentRuntime reports (xmax = 0) — i.e. the row was inserted, not updated. Heartbeats and re-registrations stay silent. - issue_executed fires from CompleteTask after an atomic UPDATE issue SET first_executed_at = now() WHERE id = $1 AND first_executed_at IS NULL flips the column for the first time. Retries, re-assignments, and comment-triggered follow-up tasks hit the WHERE clause and no-op. Carries nth_issue_for_workspace so the ≥1/≥2/≥5/≥10 buckets filter without extra queries. - team_invite_sent fires from CreateInvitation and team_invite_accepted from AcceptInvitation, closing the expansion funnel. Adds a 050 migration for issue.first_executed_at plus a partial index so the workspace-scoped executed-count query doesn't scan the never-executed tail. * feat(config): surface PostHog key via /api/config Extends AppConfig with posthog_key / posthog_host sourced from env on every request (so operators can rotate the key via secret refresh without a restart). Reading the key off the server — rather than baking it into the frontend bundle via NEXT_PUBLIC_* — means self-hosted instances inherit the blank key automatically and never ship events upstream. * feat(analytics): wire posthog-js identify + UTM capture on the client Adds @multica/core/analytics — a thin wrapper around posthog-js that owns attribution capture and identity merge. Posthog-js config comes from /api/config (not NEXT_PUBLIC_*), so self-hosted instances whose server returns an empty key automatically run the SDK inert. captureSignupSource stamps a multica_signup_source cookie with UTM params and the referrer's origin (never the full referrer — that can leak OAuth code/state in the callback URL). The backend signup event reads this cookie on new-user creation. Identity flows: - auth-initializer fires identify() right after getMe() resolves, on both cookie and token paths. A getConfig/getMe race is handled by buffering a pending identify inside the analytics module and flushing it once initAnalytics finishes. - auth store calls identify() on verifyCode / loginWithGoogle / loginWithToken and resetAnalytics() on logout so the next login merges cleanly without bleeding events. * docs(analytics): describe runtime_registered, issue_executed, invite events Fills in the schema for the remaining funnel events. Captures the design commentary that belongs next to the contract rather than in a PR description — in particular why issue_executed uses the atomic first_executed_at flip instead of counting task-terminal events, and why runtime_registered relies on xmax = 0 rather than a query-then-write. * fix(analytics): drop non-atomic nth_issue_for_workspace from issue_executed Computing the workspace's Nth-issue ordinal at emit time is not atomic under concurrent first-completions — two transactions can both run MarkIssueFirstExecuted, then both run CountExecutedIssuesInWorkspace, and both observe count=1 before either has committed, so both events go out stamped as n=1. Serialising it would mean a per-workspace advisory lock or a SERIALIZABLE-isolated tx; PostHog answers the same question exactly at query time via row_number() partitioned by workspace_id, so the emit-time property adds risk without adding information. Removes the property from analytics.IssueExecuted, deletes the unused CountExecutedIssuesInWorkspace query, and regenerates sqlc. The partial index stays — any future workspace-scoped executed-issue query will want it. * fix(analytics): wire $pageview and harden signup_source cookie payload Two frontend fixes from the PR review: - PageviewTracker, mounted under WebProviders, fires capturePageview on every Next.js App Router path / query-string change. Without this the capturePageview helper in @multica/core/analytics was never called and the acquisition funnel's / → signup step was empty. - captureSignupSource now caps each UTM / referrer value at 96 chars *before* JSON.stringify, and drops the whole cookie when the serialised payload still exceeds 512 chars. Previously the overall slice(0, 256) could leave a half-JSON string on the wire that neither the backend nor PostHog could parse. Both capturePageview and identify now buffer a single pending call when fired before initAnalytics resolves — otherwise the initial "/" pageview and same-turn login identify race the /api/config fetch and get dropped. resetAnalytics clears both buffers so a logout→login cycle stays clean. * fix(analytics): URL-decode signup_source cookie on read Go does not URL-decode Cookie.Value automatically, so the frontend's JSON-then-encodeURIComponent payload was landing in PostHog as percent-encoded garbage (%7B%22utm_source...). Unescape on read so the backend receives the original JSON string the frontend intended, and drop values that fail to decode or exceed the server-side cap — sending truncated garbage is worse than sending nothing. Oversized-cookie guard matches the frontend's SIGNUP_SOURCE_MAX_LEN. * docs(analytics): reflect nth-issue drop, $pageview wiring, cookie encoding Pulls the schema doc back in line with the code: issue_executed no longer advertises nth_issue_for_workspace (with a note about why PostHog derives it at query time instead), the frontend $pageview section names the actual PageviewTracker component that fires it, and the signup_source section documents the per-value cap / overall drop rule and the encode-on-write / decode-on-read contract. --------- Co-authored-by: Jiang Bohan <bhjiang@outlook.com> |
||
|
|
a73336dcf8 |
feat(daemon): persistent UUID identity + legacy-id merge at register-time (#1220)
* feat(daemon): persistent UUID identity + legacy-id merge at register-time daemon_id is now a stable UUID persisted to `<profile-dir>/daemon.id` on first start, replacing the hostname-derived id that drifted whenever `.local` appeared/disappeared, a system was renamed, or a profile switched — each of which used to mint a fresh `agent_runtime` row and strand agents on the old one. To migrate existing installs without operator intervention, the daemon reports every legacy id it may have registered under previously (`host`, `host` with `.local` stripped, and `host[-profile]` variants for both). At register-time the server looks up each candidate row scoped to (workspace, provider), re-points its agents and tasks onto the new UUID-keyed row, records which legacy id was subsumed in the new `legacy_daemon_id` column for audit, and deletes the stale row. Result: users running `xxx.local`-keyed runtimes today transparently land on the new UUID row on next daemon restart. The hostname-prefix `MigrateAgentsToRuntime` / `daemon_id LIKE '...-%'` compatibility shim is no longer needed and has been removed along with the handler call that invoked it. * fix(daemon): handle bidirectional .local drift and case drift in legacy merge Review on #1220 flagged two gaps in the legacy-id migration candidate set: 1. Reverse .local: LegacyDaemonIDs only added the stripped variant when the current hostname ended in `.local`. The opposite direction — DB has `foo.local`, current host is `foo` — was missed, so runtimes registered under the `.local` variant stayed orphaned after upgrade. Now both variants (`foo` and `foo.local`) are always emitted, regardless of what `os.Hostname()` currently returns, plus their `-<profile>` suffix forms. 2. Case drift: os.Hostname() has been observed returning different casings on the same machine across mDNS/reboot state. A case-sensitive `=` comparison stranded rows like `Jiayuans-MacBook-Pro.local` when the daemon later reported `jiayuans-macbook-pro.local`. FindLegacyRuntimeByDaemonID now uses `LOWER(daemon_id) = LOWER(@daemon_id)` on both sides, so casing differences merge rather than orphan. The (workspace_id, provider) prefix still bounds the scan to a tiny set of rows so the non-indexed LOWER() comparison has negligible cost. Tests: TestLegacyDaemonIDs gets the mixed-case + reverse-direction cases; daemon_test.go adds TestDaemonRegister_MergesLegacyDaemonIDRuntime_ReverseDotLocal and TestDaemonRegister_MergesLegacyDaemonIDRuntime_CaseDrift. * fix(daemon): consolidate every case-duplicate legacy runtime, not just the first Follow-up review on #1220: after switching to `LOWER(daemon_id) = LOWER(@daemon_id)`, the single-row lookup still only merged one legacy row per candidate. If a machine already had two rows in the DB that differed only in casing (e.g. `Jiayuans-MacBook-Pro.local` AND `jiayuans-macbook-pro.local` coexisting because earlier hostname drift already minted a duplicate), only one of them got consolidated and the other stayed orphaned — violating the "no duplicate runtime per machine after backfill" acceptance. - FindLegacyRuntimeByDaemonID → FindLegacyRuntimesByDaemonID (:many) - mergeLegacyRuntimes iterates every returned row and dedupes across overlapping legacy candidates so `foo` and `foo.local` both resolving to the same stored row don't double-process Test: TestDaemonRegister_MergesAllCaseDuplicateLegacyRuntimes seeds two case-duplicate rows with one agent each and confirms both rows are deleted and both agents end up on the new UUID-keyed row. |
||
|
|
ff5f6ac2ee |
fix(daemon): prevent duplicate runtime registration on profile switch (#906)
* fix(daemon): prevent duplicate runtime registration on profile switch The daemon_id included a profile name suffix (e.g. "hostname-staging"), so switching profiles created a new daemon_id that bypassed the UPSERT dedup constraint, leaving orphaned runtime records in the database. Three changes: - Remove profile suffix from daemon_id — use stable hostname only. The unique constraint (workspace_id, daemon_id, provider) already prevents collisions within the same workspace. - Auto-migrate agents from old offline runtimes to the newly registered runtime during DaemonRegister (same workspace/provider/owner). - Add TTL-based GC in the runtime sweeper to delete offline runtimes with no active agents after 7 days. Closes MUL-695 * fix(daemon): address code review issues on PR #906 1. Move gcRuntimes() to the main sweep loop — previously it was inside sweepStaleRuntimes() after an early return, so it only ran when new runtimes were marked stale. Now it runs every sweep cycle independently. 2. Fix DeleteStaleOfflineRuntimes to exclude runtimes with ANY agent reference (not just active ones). The FK agent.runtime_id is ON DELETE RESTRICT, so archived agents also block deletion. 3. Scope MigrateAgentsToRuntime to the same machine by matching daemon_id LIKE '<current_daemon_id>-%'. This prevents cross-machine agent migration when the same user has multiple devices. |
||
|
|
6209e2f3ae |
fix(server): allow deleting runtimes when all bound agents are archived (#589)
Previously, runtimes could never be deleted once an agent was created because agents can only be archived (not deleted) and the count check included archived agents. Now the check only counts active agents, and archived agents are cleaned up before runtime deletion. |
||
|
|
ff27a249cc |
feat(runtime): add owner tracking, filtering, and delete (#535)
Add owner_id to agent_runtime table to track who registered each runtime. Backend: new delete endpoint with role-based permissions (owner/admin can delete any, members only their own), list filtering by owner (?owner=me), and agent dependency check before deletion. Frontend: Mine/All filter toggle in runtime list, owner display in list items and detail view, delete button with AlertDialog confirmation. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
b112d1f1ae |
feat(tasks): add coalescing queue and task lifecycle guards
- Coalescing queue: use HasPendingTaskForIssue (queued/dispatched only) instead of HasActiveTaskForIssue so comments during a running task enqueue exactly one follow-up task that picks up all new comments. - Stale task cleanup: runtime sweeper now fails orphaned tasks when their runtime goes offline (daemon crash/network partition). - Cancel-aware daemon: handleTask checks task status after execution and discards results if the task was cancelled mid-run (e.g. reassign). - Terminal issue guard: ClaimTaskForRuntime auto-cancels pending tasks for done/cancelled issues instead of executing them. - Race condition safety net: unique partial index ensures at most one pending task per issue at the DB level. |
||
|
|
b3bbf92a1d |
fix(runtime): add server-side sweeper to detect stale runtimes
The only path to marking a runtime offline was the daemon's deregister call on graceful shutdown. If the daemon crashed, was killed, or lost network, the status stayed "online" forever. Add a background goroutine that sweeps every 30s and marks runtimes offline after 45s without a heartbeat (3 missed intervals). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
38d595d81d |
feat(cli): restructure CLI commands for better UX
- Add top-level `multica login` that combines auth + workspace auto-discovery - Restructure daemon into subcommands: start, stop, status, logs - Add background daemon mode with PID management - Add daemon deregistration on shutdown (new API endpoint + SQL query) - Remove unused commands: runtime list, status, agent get/delete/stop - Make `config` show config directly instead of requiring `config show` - Update README to reflect new CLI structure Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
cdfa63af15 | feat(runtime): add local codex daemon pairing |