mirror of
https://github.com/multica-ai/multica.git
synced 2026-06-17 03:38:32 +02:00
* feat(runtimes): weekly usage dimension + tz-aware aggregation (MUL-2382)

  Adds a Weekly view to the runtime Usage chart alongside Daily and Hourly, backed by `aggregateByWeek` on the existing 180-day daily cache (no new endpoint). Weeks are ISO 8601 Mon–Sun; the in-progress week is rendered at half opacity and tooltip-labelled "partial · N / 7 days".

  Side effects called out in the RFC:
  - `sliceWindow` now reads "today" in the runtime's IANA timezone, fixing a one-day drift at the window edge when the browser and runtime sit in different time zones.
  - ActivityHeatmap rows are reordered Mon → Sun to match the rest of the Weekly aggregation; "today" is computed in runtime tz so the grid's trailing column lines up with the daily rows the backend buckets.

  Dimension / period coupling: switching dimension resets the period to that dimension's default when the active value isn't in its allowed set (Hourly 7/30, Daily 7/30/90, Weekly 30/90/180).

  Unit tests cover weekStart / addDays / tz-aware today, the sliceWindow boundary, and aggregateByWeek's partial-week math.

  Co-authored-by: multica-agent <github@multica.ai>

* fix(runtimes): weekly chart shows trailing calendar weeks (MUL-2382)

  aggregateByWeek built one bucket per week-with-data, and the caller took the last N buckets. With sparse data (old populated weeks plus empty stretches near today), the slice surfaced the old weeks instead of the trailing in-window calendar weeks the user selected.

  Now aggregateByWeek takes weekCount and emits exactly that many trailing calendar weeks anchored at today's week in the runtime tz. Buckets are pre-zeroed so empty in-range weeks render as empty bars; rows outside the window are dropped.

  Co-authored-by: multica-agent <github@multica.ai>

* feat(usage): drop Hourly dim + add Daily/Weekly to workspace dashboard (MUL-2382)

  - Remove Hourly from the runtime usage WHEN-chart: segmented control is now Daily / Weekly. Drop the HourlyActivityChart component, aggregateCostByHour helper, byHour query subscription, and the when_tab_hourly i18n key.
  - Add the same Daily / Weekly dimension toggle to the workspace-level Usage page (dashboard-page.tsx). Time-range linkage matches the runtime page: Daily allows 7/30/90 (default 30), Weekly allows 30/90/180 (default 90); switching dimensions resets `days` when the current value isn't in the new dimension's set.
  - Reuse `aggregateByWeek` from runtimes/utils for cost / tokens (signature relaxed to accept the wider DashboardUsageDaily shape). Add `aggregateWeeklyTime` / `aggregateWeeklyTasks` in dashboard/utils with identical pre-zeroed trailing-week semantics. Workspace dashboard uses the user-chosen timezone (existing TimezoneSelect) as the week-boundary tz; runtime page continues to use the runtime's IANA tz.
  - New `WeeklyTimeChart` / `WeeklyTasksChart` mirror their daily counterparts plus partial-week half-opacity bars and rangeLabel tooltips, matching the existing Weekly cost / tokens charts.
  - Tests: drop hourly-related setup; add weekly run-time / tasks coverage asserting pre-zeroed trailing buckets and the same MUL-2382 sparse window-scoping regression we caught on the runtime side.

  Co-authored-by: multica-agent <github@multica.ai>

* fix(usage): correct workspace Weekly window + lock tz to UTC (MUL-2382)

  Two blocking correctness bugs from Emacs's PR #2822 review:

  1. The Weekly chart paints `ceil(days/7)` trailing calendar weeks but the API was still asked for exactly `days`. Worst case (today = Sunday on a 30D request) the leftmost Monday sits 34 days back, so the first week's bucket was silently truncated. Over-fetch the per-date queries to `weekCount * 7` days when Weekly is active; per-agent rollups stay at `days` so the KPI / leaderboard labels keep their advertised window. Daily-aggregation surfaces (cost/tokens/time/tasks KPIs and the Daily chart) re-scope the over-fetched rows back to `days` so the labels stay consistent.
  2. The backend dashboard rollup buckets data by UTC `bucket_date` (and the raw fallback queries by `DATE(tu.created_at)`, also UTC), but the frontend was driving Weekly boundaries from the user-chosen `TimezoneSelect`. Near midnight UTC that put cross-boundary rows into the wrong calendar week. Lock workspace Weekly to UTC and remove the timezone picker from this page; the runtime detail page keeps its own `runtime.timezone`-anchored aggregation, which is consistent because its rollup is materialized in that runtime's tz.

  Verification: pnpm --filter @multica/views test (636 passed), typecheck clean, lint 0 errors / 13 pre-existing warnings.

  Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
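The window arithmetic described in the commit message can be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: `mondayOf`, `weekCountFor`, `fetchDaysFor`, and `trailingWeeks` are hypothetical names, the row shape is trimmed to two fields, and all dates are treated as UTC calendar days (matching the workspace-dashboard case only).

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the real helpers.
type DayRow = { date: string; total_seconds: number };
type WeekBucket = { weekStart: string; weekEnd: string; totalSeconds: number };

const DAY_MS = 86_400_000;

// Shift a YYYY-MM-DD date by n days, treating it as a UTC calendar day.
function addDays(date: string, n: number): string {
  const ms = Date.parse(`${date}T00:00:00Z`) + n * DAY_MS;
  return new Date(ms).toISOString().slice(0, 10);
}

// ISO 8601 week start: the Monday on or before the given date.
function mondayOf(date: string): string {
  const dow = new Date(Date.parse(`${date}T00:00:00Z`)).getUTCDay(); // 0 = Sunday
  return addDays(date, -((dow + 6) % 7));
}

// Number of trailing calendar weeks painted for a `days` selection.
function weekCountFor(days: number): number {
  return Math.ceil(days / 7);
}

// Days to request so the leftmost Monday is always inside the fetched range.
function fetchDaysFor(days: number): number {
  return weekCountFor(days) * 7;
}

// Emit exactly `weekCount` pre-zeroed buckets ending at today's week,
// then add in-window rows and silently drop everything older.
function trailingWeeks(rows: DayRow[], today: string, weekCount: number): WeekBucket[] {
  const current = mondayOf(today);
  const buckets: WeekBucket[] = [];
  const byStart: Record<string, WeekBucket> = {};
  for (let i = weekCount - 1; i >= 0; i--) {
    const start = addDays(current, -7 * i);
    const bucket = { weekStart: start, weekEnd: addDays(start, 6), totalSeconds: 0 };
    buckets.push(bucket);
    byStart[start] = bucket;
  }
  for (const row of rows) {
    const bucket = byStart[mondayOf(row.date)];
    if (bucket) bucket.totalSeconds += row.total_seconds;
  }
  return buckets;
}
```

With today on a Sunday such as 2026-05-24 and a 30-day selection, `weekCountFor(30)` is 5 and the leftmost bucket starts on 2026-04-20, which is 34 days back. A 30-day fetch would therefore truncate that week, while `fetchDaysFor(30)` (35 days) covers it.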
308 lines
10 KiB
TypeScript
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import {
  aggregateAgentTokens,
  aggregateDailyCost,
  aggregateWeeklyTasks,
  aggregateWeeklyTime,
  computeDailyTotals,
  formatDuration,
  mergeAgentDashboardRows,
} from "./utils";

describe("aggregateDailyCost", () => {
  it("collapses multiple rows per day into one stack and sorts by date asc", () => {
    const result = aggregateDailyCost([
      {
        date: "2026-05-10",
        model: "claude-sonnet-4-6",
        input_tokens: 1_000_000,
        output_tokens: 500_000,
        cache_read_tokens: 0,
        cache_write_tokens: 0,
        task_count: 3,
      },
      {
        date: "2026-05-09",
        model: "claude-sonnet-4-6",
        input_tokens: 1_000_000,
        output_tokens: 0,
        cache_read_tokens: 0,
        cache_write_tokens: 0,
        task_count: 1,
      },
    ]);

    // Sort: oldest day first.
    expect(result.map((r) => r.date)).toEqual(["2026-05-09", "2026-05-10"]);
    // claude-sonnet-4-6: input $3/M, output $15/M.
    // 2026-05-09 → 1M input × $3 = $3 input, $0 output, $0 cache.
    expect(result[0]).toMatchObject({ input: 3, output: 0, cacheWrite: 0, total: 3 });
    // 2026-05-10 → $3 input + (0.5M × $15) = $7.5 output. Total $10.5.
    expect(result[1]).toMatchObject({ input: 3, output: 7.5, cacheWrite: 0, total: 10.5 });
  });

  it("treats unmapped models as zero-cost", () => {
    const result = aggregateDailyCost([
      {
        date: "2026-05-10",
        model: "made-up-model",
        input_tokens: 999_999_999,
        output_tokens: 0,
        cache_read_tokens: 0,
        cache_write_tokens: 0,
        task_count: 0,
      },
    ]);
    expect(result[0]?.total).toBe(0);
  });
});

describe("aggregateAgentTokens", () => {
  it("folds per-(agent, model) rows into per-agent totals and sorts by cost desc", () => {
    const rows = aggregateAgentTokens([
      {
        agent_id: "small-spender",
        model: "claude-sonnet-4-6",
        input_tokens: 100_000,
        output_tokens: 0,
        cache_read_tokens: 0,
        cache_write_tokens: 0,
        task_count: 1,
      },
      {
        agent_id: "big-spender",
        model: "claude-sonnet-4-6",
        input_tokens: 5_000_000,
        output_tokens: 0,
        cache_read_tokens: 0,
        cache_write_tokens: 0,
        task_count: 3,
      },
      {
        agent_id: "big-spender",
        model: "claude-haiku-4-5",
        input_tokens: 1_000_000,
        output_tokens: 0,
        cache_read_tokens: 0,
        cache_write_tokens: 0,
        task_count: 2,
      },
    ]);

    expect(rows.map((r) => r.agentId)).toEqual(["big-spender", "small-spender"]);
    expect(rows[0]?.taskCount).toBe(5);
    // big-spender spans two models; verify its cost exceeds small-spender's.
    expect(rows[0]!.cost).toBeGreaterThan(rows[1]!.cost);
  });
});

describe("computeDailyTotals", () => {
  it("sums tokens across rows and adds estimated cost", () => {
    const totals = computeDailyTotals([
      {
        date: "2026-05-10",
        model: "claude-sonnet-4-6",
        input_tokens: 1_000_000,
        output_tokens: 0,
        cache_read_tokens: 0,
        cache_write_tokens: 0,
        task_count: 2,
      },
      {
        date: "2026-05-09",
        model: "claude-sonnet-4-6",
        input_tokens: 2_000_000,
        output_tokens: 0,
        cache_read_tokens: 0,
        cache_write_tokens: 0,
        task_count: 3,
      },
    ]);
    expect(totals.input).toBe(3_000_000);
    expect(totals.cost).toBe(9); // 3M × $3/M
    expect(totals.taskCount).toBe(5);
  });
});

describe("mergeAgentDashboardRows", () => {
  it("uses run-time rollup's per-agent task count, not the token sum", () => {
    // Token rollup returns two (agent, model) rows for the same task
    // (the agent ran one task that touched two models). The token-side
    // aggregator sums per-row task_count and lands at 2; the run-time
    // rollup correctly reports the underlying distinct count of 1.
    const tokenRows = [
      {
        agentId: "agent-a",
        tokens: 3_000_000,
        cost: 12,
        taskCount: 2, // overcounted because (model-1: 1) + (model-2: 1)
      },
    ];
    const runTimeRows = [
      {
        agent_id: "agent-a",
        total_seconds: 600,
        task_count: 1, // truth: one task touched both models
        failed_count: 0,
      },
    ];
    const merged = mergeAgentDashboardRows(tokenRows, runTimeRows);
    expect(merged).toHaveLength(1);
    expect(merged[0]!.taskCount).toBe(1);
    expect(merged[0]!.seconds).toBe(600);
  });

  it("falls back to token count when no run-time row exists (in-flight task)", () => {
    // Tokens reported mid-run; task hasn't terminated yet so the run-time
    // rollup is silent on this agent. Keep the token-side estimate
    // instead of dropping the agent from the table entirely.
    const merged = mergeAgentDashboardRows(
      [{ agentId: "agent-b", tokens: 100, cost: 0.5, taskCount: 1 }],
      [],
    );
    expect(merged[0]!.taskCount).toBe(1);
    expect(merged[0]!.seconds).toBe(0);
  });

  it("includes agents that have run-time but no tokens", () => {
    // Task errored before reporting any usage: the run-time row exists but
    // there's no corresponding token row. Agent must still appear on the
    // list with zeroed-out token columns.
    const merged = mergeAgentDashboardRows(
      [],
      [{ agent_id: "agent-c", total_seconds: 30, task_count: 1, failed_count: 1 }],
    );
    expect(merged).toHaveLength(1);
    expect(merged[0]!.tokens).toBe(0);
    expect(merged[0]!.cost).toBe(0);
    expect(merged[0]!.taskCount).toBe(1);
  });

  it("sorts by cost desc with run-time as a tiebreaker", () => {
    const merged = mergeAgentDashboardRows(
      [
        { agentId: "low", tokens: 100, cost: 1, taskCount: 1 },
        { agentId: "high", tokens: 100, cost: 9, taskCount: 1 },
        { agentId: "zero-cost-long", tokens: 0, cost: 0, taskCount: 0 },
      ],
      [
        { agent_id: "zero-cost-long", total_seconds: 1000, task_count: 5, failed_count: 0 },
      ],
    );
    expect(merged.map((r) => r.agentId)).toEqual(["high", "low", "zero-cost-long"]);
  });
});

describe("formatDuration", () => {
  it("formats seconds-only durations", () => {
    expect(formatDuration(45, "<1m")).toBe("45s");
  });
  it("formats minutes and seconds when under one hour", () => {
    expect(formatDuration(150, "<1m")).toBe("2m 30s");
    expect(formatDuration(60, "<1m")).toBe("1m");
  });
  it("formats hours and minutes when under one day", () => {
    expect(formatDuration(3 * 3600 + 17 * 60, "<1m")).toBe("3h 17m");
    expect(formatDuration(3600, "<1m")).toBe("1h");
  });
  it("formats days and hours when more than 24 hours", () => {
    expect(formatDuration(2 * 86400 + 5 * 3600, "<1m")).toBe("2d 5h");
  });
  it("falls back to the supplied label for sub-second durations", () => {
    expect(formatDuration(0, "<1m")).toBe("<1m");
    expect(formatDuration(0.4, "<1m")).toBe("<1m");
  });
});

// ---------------------------------------------------------------------------
// Weekly run-time / tasks aggregation. Mirrors the runtimes-side
// aggregateByWeek tests: trailing N calendar weeks anchored at today-in-tz,
// pre-zeroed buckets, partial-week metadata, and rows outside the window
// dropped. We assert the same invariants on the workspace dashboard helpers
// so all four metrics behave consistently when the user toggles Weekly.
// ---------------------------------------------------------------------------

describe("aggregateWeeklyTime", () => {
  beforeEach(() => {
    vi.useFakeTimers();
  });
  afterEach(() => {
    vi.useRealTimers();
  });

  it("folds per-day run-time rows into Mon-anchored weekly totals", () => {
    // 2026-05-19 is a Tuesday → current week is Mon=05-18..Sun=05-24.
    vi.setSystemTime(new Date("2026-05-19T12:00:00Z"));
    const rows = [
      { date: "2026-05-11", total_seconds: 100, task_count: 0, failed_count: 0 },
      { date: "2026-05-17", total_seconds: 50, task_count: 0, failed_count: 0 },
      { date: "2026-05-18", total_seconds: 25, task_count: 0, failed_count: 0 },
    ];
    const result = aggregateWeeklyTime(rows, "UTC", 2);
    expect(result).toHaveLength(2);
    expect(result[0]).toMatchObject({
      weekStart: "2026-05-11",
      weekEnd: "2026-05-17",
      totalSeconds: 150,
      partial: false,
      daysCovered: 7,
    });
    expect(result[1]).toMatchObject({
      weekStart: "2026-05-18",
      totalSeconds: 25,
      partial: true,
      daysCovered: 2, // Mon + Tue
    });
  });

  it("drops rows that fall outside the trailing window and keeps empty buckets", () => {
    // Same MUL-2382 sparse-data regression we caught on the runtimes side:
    // an old populated week must not surface when the requested window
    // doesn't include it; in-range empty weeks must remain as zero buckets.
    vi.setSystemTime(new Date("2026-05-19T12:00:00Z"));
    const rows = [
      // 2026-04-13 is a Monday, exactly one week earlier than the oldest
      // in-range week (Mon=04-20) for a 5-week trailing window.
      { date: "2026-04-13", total_seconds: 999, task_count: 0, failed_count: 0 },
    ];
    const result = aggregateWeeklyTime(rows, "UTC", 5);
    expect(result.map((w) => w.weekStart)).toEqual([
      "2026-04-20",
      "2026-04-27",
      "2026-05-04",
      "2026-05-11",
      "2026-05-18",
    ]);
    for (const w of result) expect(w.totalSeconds).toBe(0);
  });
});

describe("aggregateWeeklyTasks", () => {
  beforeEach(() => {
    vi.useFakeTimers();
  });
  afterEach(() => {
    vi.useRealTimers();
  });

  it("splits completed and failed counts per calendar week", () => {
    vi.setSystemTime(new Date("2026-05-19T12:00:00Z"));
    const rows = [
      { date: "2026-05-12", total_seconds: 0, task_count: 5, failed_count: 1 },
      { date: "2026-05-18", total_seconds: 0, task_count: 3, failed_count: 0 },
    ];
    const result = aggregateWeeklyTasks(rows, "UTC", 2);
    expect(result[0]).toMatchObject({
      weekStart: "2026-05-11",
      completed: 4,
      failed: 1,
    });
    expect(result[1]).toMatchObject({
      weekStart: "2026-05-18",
      completed: 3,
      failed: 0,
      partial: true,
    });
  });
});