mirror of
https://github.com/multica-ai/multica.git
synced 2026-07-05 21:39:54 +02:00
* feat(agent-status): add workspace live-tasks endpoint and TaskFailureReason type

  Lays the API + type contract for the front-end agent presence cache:

  - New `GET /api/active-tasks` returns active (queued/dispatched/running) tasks plus failed tasks within the last 2 minutes for the current workspace. The 2-minute window powers a UI-side auto-clearing "Failed" agent state without back-end pollers.
  - `agent_task_queue` has no workspace_id column, so the query JOINs agent; `SELECT atq.*` keeps `failure_reason` (migration 055) on the wire.
  - Adds `TaskFailureReason` to `AgentTask` so the UI can map the 5 backend classifiers (agent_error / timeout / runtime_offline / runtime_recovery / manual) to copy without parsing free-text errors.
  - New `api.getActiveTasksForWorkspace()` client method; workspace is resolved server-side from the X-Workspace-Slug header (no path param, matching /api/agents and /api/runtimes conventions).

  Includes the joint engineering plan and designer brief that scope the broader Agent / Runtime status redesign. Phase 0 is this contract plus the front-end derivation layer landing in the next commit.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(agent-status): derive presence/health states with WS sync and desktop IPC bridge

  Adds the front-end derivation layer that turns raw server data into the user-facing 5-state agent / 4-state runtime enums. UI files are deliberately untouched in this commit; derivation lives behind hooks (useAgentPresence, useRuntimeHealth) that any component can call with zero additional network traffic.

  Architecture:

  - Derivation is pure functions in packages/core/{agents,runtimes}; the back-end stays free of UI translation. Agents algorithm: runtime offline > recent failed (2-min window) > running > queued > available. Runtimes algorithm: status + last_seen_at -> online / recently_lost / offline / about_to_gc.
  - A single workspace-wide active-tasks query backs all per-agent presence reads, eliminating N+1 across hover cards, list rows, and pickers. 30-second tick re-renders the hooks so the failed window expires even when no underlying data changes.
  - WS task lifecycle events (dispatch / completed / failed / cancelled) invalidate active-tasks via the prefix dispatcher. completed/failed were removed from specificEvents so they go through both the prefix invalidate and the existing chat ws.on() handlers. Reconnect refetch picks up active-tasks too.
  - Desktop bridges window.daemonAPI.onStatusChange directly into the runtimes cache via setQueryData, giving the local daemon sub-second feedback (vs. 75s server sweep). Bridge is wsId-bound so workspace switches automatically rebind the subscription; daemon_id matching covers the same-daemon-multiple-providers case.

  24 derivation unit tests cover all branches plus null/empty/boundary inputs (FAILED_WINDOW_MS edges, null last_seen_at, missing completed_at). Full core suite: 112 tests passing. Typecheck green across all 8 workspace packages.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(agent-status): redesign agent runtime status as two orthogonal dimensions

  Splits the conflated 5-state agent presence into two independent axes:

  - AgentAvailability (3-state): online / unstable / offline. Drives the dot indicator everywhere a dot appears. Pure runtime reachability; never sticky-red because of a past task outcome.
  - LastTaskState (5-state): running / completed / failed / cancelled / idle. Surfaced as text + icon on focused surfaces (hover card, agent detail page, agents list, runtime detail). Never colours the dot.

  Major changes:

  * Domain layer: AgentPresence union → AgentAvailability + LastTaskState. derive-presence split into deriveAgentAvailability + deriveLastTaskState + deriveAgentPresenceDetail orchestrator. Tests reorganised into three groups (availability invariants, last-task invariants, composition).
  * Visual config: presenceConfig (5 entries) → availabilityConfig (3) + taskStateConfig (5). availabilityOrder + lastTaskOrder for filter chips.
  * Workspace-level presence prefetch: new useWorkspacePresencePrefetch hook + WorkspacePresencePrefetch mount component, wired into DashboardLayout (web) and WorkspaceRouteLayout (desktop). Hover cards render synchronously with no skeleton flash on first hover.
  * ActorAvatar hover: flipped default. disableHoverCard removed, enableHoverCard added (default false). Opt-in at ~14 decision-moment surfaces; pickers / decoration sub-chips stay plain. Status dot decoupled (showStatusDot prop) so picker rows can show presence without nesting popovers.
  * Hover cards: AgentProfileCard simplified to availability dot only, Detail link top-right (logs live on the detail page). New MemberProfileCard mirrors the structure: name + role + email + top-2 owned agents (sorted by 30d run count) with click-through to agent detail.
  * Agents list: split Status into two columns: availability (3-color dot + label) and Last run (task icon + label, optional running counts). Two independent filter chip groups (Status + Last run); combination acts as intersection ("online + failed" finds broken-but-alive agents).
  * Other UI surfaces (issue list/board/detail, comments, autopilots, projects, runtimes, mention autocomplete, subscribers picker) updated to the new dot semantics; status dot now strictly 3-color.

  Server changes accompany the client redesign (workspace-wide agent-task-snapshot endpoint, runtime usage queries, etc.) to feed the derive layer with the data it needs.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(agent-detail): drop last-task chip from detail header + inspector

  The Recent work section on the agent detail page already shows the same data (with task titles, timestamps, error context), so surfacing "Completed" / "Failed" / etc. up in the header was redundant chrome. Detail surfaces now show only the 3-state availability dot.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tables): handle narrow viewports across agents / skills / runtimes

  Three table layouts were squeezing content into adjacent cells at intermediate widths. Each fix is small and targeted:

  * runtime-list: the Runtime cell's base name had `shrink-0`, so it refused to truncate when its grid column was narrowed under width pressure; the name visually overflowed into the Health column ("ClaudeOnline" etc). Removed shrink-0, added truncate. The Health column was also a fixed 9.5rem reservation for the worst-case "Recently lost · 2m 14s ago" copy; switched to minmax(0,1fr) so it competes fairly with Runtime.
  * skills-page: had a single grid template with no responsive breakpoints, so all 6 columns were rendered at any width and got visually jammed below md. Added a <md template that drops Source + Updated; the row markup hides those cells via `hidden md:block` / `md:contents`.
  * agent-list-item: the new Last run column was reserved at minmax(8rem, max-content); on narrow md viewports the 8rem floor pushed the row past available width. Changed to minmax(0,max-content) so the cell shrinks under pressure (its content already truncates).

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(agent-card): hover-only Detail + add Runtime row + breathing room

  Three small polish tweaks to the agent hover card:

  - Detail link gets `mr-1` + fades in only on card hover (group-hover). It was visually flush against the popover edge and competing for attention; now it stays out of the way during a quick glance and surfaces only when the user is dwelling on the card.
  - Runtime row is back, in the meta block (cloud/local icon + runtime name). The earlier removal was over-aggressive: knowing where an agent runs is part of "who is this agent". The wifi badge stays dropped because the availability dot in the header already conveys reachability.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(runtime): wifi-style health icon (4-state) for runtime list + agent card

  Replaces the 6px coloured dot with a wifi-shape icon that carries both state (Wifi vs WifiOff) and severity (success/warning/muted/destructive).

  Mapping:

  - online → Wifi (success)
  - recently_lost → WifiHigh (warning): transient hiccup, fewer bars
  - offline → WifiOff (muted): long unreachable
  - about_to_gc → WifiOff (destructive): sweeper coming soon

  Used in two places:

  - Runtime list: replaces HealthDot in the dedicated leading-icon column. Bumped the column from 0.5rem (dot-sized) to 0.875rem (icon-sized).
  - Agent profile card RuntimeRow: derives runtime health from runtime + clock (matching the 4-state semantics) and renders HealthIcon next to the runtime name. Cloud runtimes always read as online. The duplicate signal with the header availability dot is intentional: it confirms WHICH runtime is the one currently in the dot's state.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
188 lines
6.8 KiB
Go
package handler

import (
	"context"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"testing"
)

// TestListWorkspaceAgentTaskSnapshot covers the agent presence snapshot endpoint:
// every active task (queued/dispatched/running) PLUS each agent's most recent
// OUTCOME task (completed/failed only). Cancelled tasks are excluded by design
// from the outcome half: they're a procedural signal, not an outcome, and
// must NOT mask a prior failure.
//
// The fixtures cover every branch the SQL must classify:
//   - actives are always returned, no dedup
//   - outcomes are deduped to "latest per agent" by completed_at
//   - the OLD 2-minute window must be irrelevant (a 5-minute-old failure is
//     still returned if it's the latest outcome)
//   - cancelled rows are NEVER returned, even when they are temporally newer
//     than a failure; this is what keeps the failed signal sticky after the
//     user cancels their queued retry
func TestListWorkspaceAgentTaskSnapshot(t *testing.T) {
	if testHandler == nil {
		t.Skip("database not available")
	}

	ctx := context.Background()
	// Three agents so we can verify per-agent semantics independently.
	agentA := createHandlerTestAgent(t, "snapshot-agent-a", []byte(`{}`))
	agentB := createHandlerTestAgent(t, "snapshot-agent-b", []byte(`{}`))
	agentC := createHandlerTestAgent(t, "snapshot-agent-c", []byte(`{}`))

	type taskFixture struct {
		agentID     string
		status      string
		completedAt string // SQL expression; "" for NULL
		label       string
	}
	fixtures := []taskFixture{
		// Agent A: actives + a newer completed supersedes an older failed.
		{agentA, "queued", "", "A.queued"},
		{agentA, "dispatched", "", "A.dispatched"},
		{agentA, "running", "", "A.running"},
		{agentA, "failed", "now() - interval '10 minutes'", "A.old_failed"},
		{agentA, "completed", "now() - interval '30 seconds'", "A.latest_completed"},

		// Agent B: old failure with no later outcome stays visible (no
		// time window).
		{agentB, "failed", "now() - interval '5 minutes'", "B.stale_failed_kept"},

		// Agent C: failure followed by a NEWER cancelled. The cancelled
		// must be skipped by the SQL filter so the failure remains visible.
		// This is the scenario where a user fails, then cancels their
		// queued retry to debug.
		{agentC, "failed", "now() - interval '5 minutes'", "C.failure"},
		{agentC, "cancelled", "now() - interval '30 seconds'", "C.newer_cancelled_must_be_ignored"},
	}

	insertedIDs := make([]string, 0, len(fixtures))
	for _, f := range fixtures {
		var id string
		var query string
		if f.completedAt == "" {
			query = `INSERT INTO agent_task_queue (agent_id, runtime_id, status, priority)
				VALUES ($1, $2, $3, 0) RETURNING id`
		} else {
			query = `INSERT INTO agent_task_queue (agent_id, runtime_id, status, priority, completed_at)
				VALUES ($1, $2, $3, 0, ` + f.completedAt + `) RETURNING id`
		}
		if err := testPool.QueryRow(ctx, query, f.agentID, testRuntimeID, f.status).Scan(&id); err != nil {
			t.Fatalf("insert %s: %v", f.label, err)
		}
		insertedIDs = append(insertedIDs, id)
	}
	t.Cleanup(func() {
		for _, id := range insertedIDs {
			testPool.Exec(ctx, `DELETE FROM agent_task_queue WHERE id = $1`, id)
		}
	})

	w := httptest.NewRecorder()
	req := newRequest(http.MethodGet, "/api/agent-task-snapshot", nil)
	testHandler.ListWorkspaceAgentTaskSnapshot(w, req)
	if w.Code != http.StatusOK {
		t.Fatalf("ListWorkspaceAgentTaskSnapshot: expected 200, got %d: %s", w.Code, w.Body.String())
	}

	var tasks []AgentTaskResponse
	if err := json.NewDecoder(w.Body).Decode(&tasks); err != nil {
		t.Fatalf("decode response: %v", err)
	}

	// Per-agent breakdown so leftover tasks from other tests in this package
	// don't pollute the assertions.
	type key struct{ agent, status string }
	counts := map[key]int{}
	for _, task := range tasks {
		if task.AgentID != agentA && task.AgentID != agentB && task.AgentID != agentC {
			continue
		}
		counts[key{task.AgentID, task.Status}]++
	}

	wantCounts := map[key]int{
		// Agent A: 3 actives + the latest outcome (completed). The older
		// failed must be excluded by DISTINCT ON.
		{agentA, "queued"}:     1,
		{agentA, "dispatched"}: 1,
		{agentA, "running"}:    1,
		{agentA, "completed"}:  1,
		// Agent B: just the failed outcome.
		{agentB, "failed"}: 1,
		// Agent C: the failed outcome must survive the temporally newer
		// cancellation; that's the whole point of excluding cancelled
		// from the outcome half.
		{agentC, "failed"}: 1,
	}
	for k, expected := range wantCounts {
		if got := counts[k]; got != expected {
			t.Errorf("agent=%s status=%s: expected %d, got %d", k.agent, k.status, expected, got)
		}
	}

	// The OLD failed terminal on agent A must be excluded.
	if counts[key{agentA, "failed"}] != 0 {
		t.Errorf("agent A old failed must be superseded by newer completed; got %d", counts[key{agentA, "failed"}])
	}

	// No cancelled row may ever appear in the snapshot; they're filtered at
	// SQL level so the front-end's "cancel doesn't mask failure" rule lands
	// without any front-end logic.
	for _, agentID := range []string{agentA, agentB, agentC} {
		if counts[key{agentID, "cancelled"}] != 0 {
			t.Errorf("agent %s: cancelled rows must be excluded from snapshot; got %d",
				agentID, counts[key{agentID, "cancelled"}])
		}
	}
}

func TestCreateAgent_RejectsDuplicateName(t *testing.T) {
	if testHandler == nil {
		t.Skip("database not available")
	}

	// Clean up any agents created by this test.
	t.Cleanup(func() {
		testPool.Exec(context.Background(),
			`DELETE FROM agent WHERE workspace_id = $1 AND name = $2`,
			testWorkspaceID, "duplicate-name-test-agent",
		)
	})

	body := map[string]any{
		"name":                 "duplicate-name-test-agent",
		"description":          "first description",
		"runtime_id":           testRuntimeID,
		"visibility":           "private",
		"max_concurrent_tasks": 1,
	}

	// First call: creates the agent.
	w1 := httptest.NewRecorder()
	testHandler.CreateAgent(w1, newRequest(http.MethodPost, "/api/agents", body))
	if w1.Code != http.StatusCreated {
		t.Fatalf("first CreateAgent: expected 201, got %d: %s", w1.Code, w1.Body.String())
	}
	var resp1 map[string]any
	if err := json.NewDecoder(w1.Body).Decode(&resp1); err != nil {
		t.Fatalf("decode first response: %v", err)
	}
	agentID1, _ := resp1["id"].(string)
	if agentID1 == "" {
		t.Fatalf("first CreateAgent: no id in response: %v", resp1)
	}

	// Second call: same name must be rejected with 409 Conflict.
	// The unique constraint prevents silent duplicates; the UI shows a clear error.
	body["description"] = "updated description"
	w2 := httptest.NewRecorder()
	testHandler.CreateAgent(w2, newRequest(http.MethodPost, "/api/agents", body))
	if w2.Code != http.StatusConflict {
		t.Fatalf("second CreateAgent with duplicate name: expected 409, got %d: %s", w2.Code, w2.Body.String())
	}
}