3 Commits

Author SHA1 Message Date
Bohan Jiang
6b70146570 test(rollup): serialise shared-singleton rollup tests across packages (MUL-3980) (#4854)
`go test ./...` compiles internal/handler and internal/scheduler into
separate binaries and runs them in parallel against the same DATABASE_URL.
Both mutate the global task_usage_hourly_rollup_state singleton (id=1) and
contend for the rollup function's advisory lock 4246, so under `-race` on CI
they interleave and fail flakily:

  - TestRollupTaskUsageHourlyCapsWindowAtOneDay reads the scheduler test's
    forced-back watermark (0.063 days ≈ the scheduler's now-90min) instead of
    "now".
  - TestPgCronConcurrentNoDoubleWrite sees a handler rollup tick advance the
    watermark past its window, yielding winners=0.

Add a dedicated session-level advisory lock (42463980, distinct from the
function's own 4246) that every test touching the singleton acquires for its
duration, serialising them across test processes. Reproduced the exact CI
failures on a concurrent stress loop (5/5 rounds) and confirmed the guard
eliminates them (8/8 rounds green).

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-07-02 18:31:45 +08:00
YYClaw
614dfae884 MUL-2488 feat(timezone): Scheduling / Viewing two-layer timezone architecture (#2968)
* docs(timezone): add scheduling/viewing timezone architecture RFC

* feat(db): replace daily rollups with task_usage_hourly, add user.timezone

Migrations 100-104: add "user".timezone (Viewing tz), build the UTC
hourly task_usage_hourly rollup with its pipeline, drop the legacy
task_usage_daily / task_usage_dashboard_daily pipelines, and drop the
agent_runtime.timezone column. Report queries now slice day boundaries
at read time by the caller-supplied @tz instead of materialising in a
fixed tz. Regenerate sqlc.

* feat(server): add task_usage_hourly backfill command

Replace the two legacy backfill commands (daily / dashboard_daily) with
a single backfill_task_usage_hourly that loads historical task_usage
into the new UTC hourly rollup, sliced per workspace.

* refactor(server): resolve viewing timezone in report handlers

Report handlers resolve the Viewing tz per request (?tz query param,
then user.timezone, then UTC) and pass it to the hourly-rollup queries.
Drop the UseDailyRollup feature flags and the old raw-scan/daily-rollup
dual paths, remove the /api/usage endpoints, and stop the daemon from
reporting and the runtime handler from accepting host timezone.

* refactor(core): switch report queries to viewing timezone

API client and dashboard/runtime queries send ?tz with each report
request, the user schema/types carry the new timezone field, and the
runtime timezone field/mutation is removed.

* feat(views): add viewing timezone preference and UI

Add the useViewingTimezone hook and a Timezone setting in Preferences;
report charts and the dashboard week boundary follow the viewer tz.
Remove the runtime detail timezone editor and its locale strings.

* fix(test): update fixtures and stabilize tests for timezone refactor

The timezone architecture refactor changed several types without
updating dependent test code:

- RuntimeDevice no longer has a timezone field — drop it from the
  create-agent-dialog runtime fixture.
- User now requires a timezone field — add it to the apps/web mockUser
  fixture.
- The PreferencesTab timezone tests asserted on the async save handler
  (PATCH then store update) with a bare expect, racing the mutation's
  settle callback, and timed out querying the Select's ~600-option IANA
  list on a loaded CI runner. Wrap the assertions in waitFor and extend
  the timeout for those three tests.

* docs(timezone): document self-host migration order and trigger invariant

Add a SELF-HOST UPGRADE ORDER runbook to the backfill command's package
comment: applying migrations 100-104 in a single migrate-up drops the
legacy daily rollups before the hourly backfill runs, leaving dashboards
empty until cron catches up.

Add an INVARIANT comment on trg_atq_dirty_hourly noting that agent_id
must be added to the trigger's OF list if it ever becomes mutable,
otherwise dirty buckets for the old agent_id are silently missed.

* style(runtimes): drop trailing blank line in runtime-detail
2026-05-21 15:33:47 +08:00
Bohan Jiang
96695a79c5 feat(dashboard): workspace/project token + run-time dashboard MUL-1882 (#2462)
* feat(dashboard): workspace/project token + run-time dashboard

Add a `/{slug}/dashboard` page showing per-agent token spend and execution
time across the whole workspace, with an optional project filter.

Backend:
  - Three new sqlc queries against task_usage + agent_task_queue: daily
    usage, per-agent usage, per-agent total run-time. All optionally
    scoped to a project via sqlc.narg('project_id'), reaching project
    through the issue join.
  - Handlers under /api/dashboard return the same wire shape the runtime
    page already consumes (model preserved for client-side cost math).

Frontend: - Shared DashboardPage in packages/views/dashboard reusing KpiCard,
    DailyCostChart, ActorAvatar, and estimateCost from the runtime page
    so the visual style and pricing math stay in lock-step.
  - Period selector (7/30/90d), project dropdown, four KPI tiles
    (cost, tokens, run time, tasks), daily cost chart, and a combined
    "cost + run time by agent" list.
  - Routed in both web (app/[slug]/(dashboard)/dashboard) and desktop
    (memory router); sidebar nav entry added under Workspace group.
Co-authored-by: multica-agent <github@multica.ai>

* fix(dashboard): drop stale project filter and stop double-counting tasks

Two issues caught in PR #2462 review:

1. Project filter held the previous selection's UUID across workspace
   switches and project deletions: the dropdown gracefully showed
   "All projects" (because the title lookup missed) while the three
   dashboard queries kept forwarding the dead UUID, leaving the UI
   looking like a full-workspace view but populated with empty
   project-scoped data. Validate the picked UUID against the current
   projects list before passing it to the queries.

2. The "by agent" table read its task count from the token rollup,
   which is grouped per (agent, model). A single task that spans two
   models lands twice and the agent's row reads e.g. "2 tasks" when
   the real count is 1. Prefer `ListDashboardAgentRunTime`'s per-agent
   distinct count when available; fall back to the token aggregate
   only for agents with no terminal run yet (in-flight tasks).

Extract the merge into `mergeAgentDashboardRows` so the precedence
rules are unit-tested directly.

Co-authored-by: multica-agent <github@multica.ai>

* test(dashboard): allocate per-workspace issue.number explicitly

TestDashboardEndpoints creates two issues in the shared fixture
workspace. issue.number defaults to 0 (migration 020), and the table
carries UNIQUE (workspace_id, number), so the second insert raced the
first on the same default and failed in CI.

Allocate MAX(number) + 1 per insert so each row gets a fresh number
without stepping on rows other tests left behind in the same workspace.

Co-authored-by: multica-agent <github@multica.ai>

* feat(dashboard): rollup table + cron-driven aggregation for dashboard

Mirror the per-runtime rollup in `task_usage_daily` (migrations 073/077/082)
to remove the per-request raw aggregation the dashboard was doing.

Migration 084 adds:
  - `task_usage_dashboard_daily` keyed on
    (bucket_date, workspace_id, agent_id, project_id, model) — the
    dimensions the dashboard actually queries, with project_id nullable
    via UNIQUE NULLS NOT DISTINCT (PG15+) so "no-project" buckets
    upsert cleanly.
  - `task_usage_dashboard_rollup_state` watermark table.
  - `task_usage_dashboard_dirty` invalidation queue.
  - Triggers on agent_task_queue DELETE, task_usage DELETE, and
    issue.project_id UPDATE — the cases the updated_at watermark can't
    see. The project_id trigger re-attributes existing rollup rows when
    a user moves an issue across projects.
  - `rollup_task_usage_dashboard_daily_window(from, to)` —
    idempotent recompute primitive (same shape as 077).
  - `rollup_task_usage_dashboard_daily()` cron entry — own advisory
    lock (4244) so it serialises independently of the runtime rollup.
  - `task_usage_dashboard_rollup_lag_seconds()` health helper.

Sqlc queries `ListDashboardUsageDailyRollup` /
`ListDashboardUsageByAgentRollup` read from the new table; the handler
dispatches between rollup and raw on a separate
`UseDailyRollupForDashboard` config flag
(`USAGE_DASHBOARD_ROLLUP_ENABLED` env). Same fail-safe default (false →
raw) so operators can roll out independently of the per-runtime flag.

Bucket date is UTC (the dashboard aggregates across runtimes that may
sit in different tzs; there's no single correct local boundary).

Adds `cmd/backfill_task_usage_dashboard_daily` mirroring the existing
per-runtime backfill — operator runs it once before flipping the flag.

Tests: - TestDashboardEndpoints now also exercises the rollup read path
    (raw vs. rollup, same project-scoped totals).
  - TestDashboardRollupReattributesOnProjectChange verifies the
    issue.project_id trigger enqueues both old + new buckets and the
    next rollup tick zeroes the old project + populates the new one.
Co-authored-by: multica-agent <github@multica.ai>

* fix(dashboard-rollup): close two invalidation gaps

Two leak paths missed by migration 084 review:

1. Issue cascade DELETE — the atq BEFORE DELETE trigger runs AFTER the
   issue row is gone, so `LEFT JOIN issue` returns NULL project_id and
   the original-project bucket never gets cleared (issue 077 calls this
   out for the runtime rollup but didn't need to act on it). Adds an
   `issue BEFORE DELETE` trigger that enqueues using OLD.project_id
   while the issue row is still readable.

2. `LinkTaskToIssue` (quick-create task attaching to a real issue post-
   completion) UPDATEs `agent_task_queue.issue_id` from NULL to a real
   id. Migration 084 only watched DELETE on atq, so usage already
   rolled up under the no-project bucket stayed attributed to NULL
   forever. Extends the atq trigger to fire on UPDATE OF issue_id too,
   enqueueing both OLD (NULL project) and NEW (linked issue's project).

Tests: - TestDashboardRollupClearsOnIssueDelete asserts rollup row drops to
    zero after issue delete + rollup tick.
  - TestDashboardRollupReattributesOnLinkTaskToIssue verifies tokens
    move from the NULL bucket to the project bucket after the UPDATE.
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-13 12:51:16 +08:00