multica/server/internal/analytics/events_test.go
Bohan Jiang b92c3fbc93 chore(analytics): stop shipping operational events to PostHog (MUL-2967) (#3720)
* chore(analytics): stop shipping operational events to PostHog (MUL-2967)

Operational / execution-lifecycle telemetry dominated PostHog event volume
and drove the bill: runtime_offline alone was ~54% of ~22.6M events/mo, and
~99% of events were billed at the higher identified-event rate. These signals
already have Prometheus counters (Grafana), so the PostHog copies were
redundant cost.

- Add analytics.IsMetricsOnly; metrics.RecordEvent now skips the PostHog
  Capture for runtime_* and autopilot_run_* while still incrementing their
  Prometheus counter (their analytics.Event constructors are retained to feed
  the metric label set via IncForEvent).
- Remove the agent_task_* PostHog path entirely: drop captureTaskEvent and the
  AgentTask* constructors/constants. Their Prometheus side is unchanged via the
  typed BusinessMetrics.RecordTask* methods. Also remove the now-dead
  taskDurationMS / willRetryTask helpers.
- Update the pairing lint test (no agent_task allow-list, no naked-Capture
  exception), add a RecordEvent skip test + IsMetricsOnly test, and update
  docs/analytics.md (taxonomy, per-event banners, reconciliation).

Product/funnel events (signup, onboarding, issue_created, issue_executed,
chat_message_sent, agent_created, autopilot_created, etc.) are unchanged and
still ship to PostHog.
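
The gating described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `isMetricsOnly`, `recorder`, and `recordEvent` are hypothetical stand-ins for `analytics.IsMetricsOnly` and `metrics.RecordEvent`, the prefix match is an assumption about how the `runtime_*` / `autopilot_run_*` rule might be checked, and plain maps and slices stand in for the Prometheus counter and the PostHog client.

```go
package main

import (
	"fmt"
	"strings"
)

// isMetricsOnly is a hypothetical stand-in for analytics.IsMetricsOnly.
// Assumption: operational events are recognised by name prefix; the real
// function may use an explicit set of event names instead.
func isMetricsOnly(name string) bool {
	return strings.HasPrefix(name, "runtime_") ||
		strings.HasPrefix(name, "autopilot_run_")
}

// recorder fakes both sinks: counters plays the role of the Prometheus
// counter, captured plays the role of events sent to PostHog.
type recorder struct {
	counters map[string]int
	captured []string
}

func newRecorder() *recorder {
	return &recorder{counters: map[string]int{}}
}

// recordEvent always increments the counter, then skips the capture step
// for metrics-only events.
func (r *recorder) recordEvent(name string) {
	r.counters[name]++
	if isMetricsOnly(name) {
		return
	}
	r.captured = append(r.captured, name)
}

func main() {
	r := newRecorder()
	for _, name := range []string{
		"runtime_offline", "autopilot_run_failed", // operational
		"issue_created", "autopilot_created", // product/funnel
	} {
		r.recordEvent(name)
	}
	fmt.Println("counted:", len(r.counters), "captured:", r.captured)
}
```

Note that `autopilot_created` does not match the `autopilot_run_` prefix, so in this sketch it is still captured, which matches the product/funnel list above.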

Co-authored-by: multica-agent <github@multica.ai>

* docs(analytics): correct agent_task Prometheus metric contract (MUL-2967)

Address PR review: the agent_task_* "Prometheus-only" banner claimed the old
PostHog event properties (task_id, agent_id, duration_ms, error_type,
will_retry, ...) were the metric label set. They are not — the real labels are
only source/runtime_mode/provider/terminal_status/failure_reason.

- Replace the agent_task_* sections with the actual metric names and labels
  (multica_agent_task_*; see business.go / labels.go), and explain that
  completed/failed/cancelled are terminal_status values on
  multica_agent_task_terminal_total, with wall-clock in the *_seconds
  histograms.
- Tighten the runtime_*/autopilot_run_* banners so id properties aren't
  mistaken for labels.
- Drop the stale AgentTask allow-list reference from the pairing lint test
  header comment.
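
To illustrate the corrected contract, here is a small sketch of a single terminal counter whose outcome is a label value rather than a separate metric. It is an assumption-laden model, not the code in business.go / labels.go: `taskLabels` and `terminalCounter` are hypothetical types, a plain map stands in for the Prometheus counter vector, and the label values used ("manual", "codex", "timeout") are illustrative only. The five field names follow the label list given above.

```go
package main

import "fmt"

// taskLabels is a hypothetical struct holding the five labels named in the
// commit message. Per-task identifiers such as task_id or agent_id are
// deliberately absent: they were event properties, not metric labels.
type taskLabels struct {
	Source, RuntimeMode, Provider, TerminalStatus, FailureReason string
}

// terminalCounter models one counter (in the spirit of
// multica_agent_task_terminal_total) keyed by its label set.
type terminalCounter struct {
	counts map[taskLabels]int
}

func newTerminalCounter() *terminalCounter {
	return &terminalCounter{counts: map[taskLabels]int{}}
}

// inc adds one observation for the given label combination.
func (c *terminalCounter) inc(l taskLabels) {
	c.counts[l]++
}

// byStatus sums all series that share a terminal_status value.
func (c *terminalCounter) byStatus(status string) int {
	total := 0
	for l, n := range c.counts {
		if l.TerminalStatus == status {
			total += n
		}
	}
	return total
}

func main() {
	c := newTerminalCounter()
	c.inc(taskLabels{Source: "manual", Provider: "codex", TerminalStatus: "completed"})
	c.inc(taskLabels{Source: "manual", Provider: "codex", TerminalStatus: "failed", FailureReason: "timeout"})
	fmt.Println("completed:", c.byStatus("completed"), "failed:", c.byStatus("failed"))
}
```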

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-04 00:48:17 +08:00

48 lines
1.8 KiB
Go

package analytics

import "testing"

func TestRuntimeReadyOmitsUnmeasuredDuration(t *testing.T) {
	ev := RuntimeReady("user-1", "workspace-1", "runtime-1", "daemon-1", "codex", 0)
	if _, ok := ev.Properties["ready_duration_ms"]; ok {
		t.Fatalf("ready_duration_ms should be omitted until it is measured")
	}

	ev = RuntimeReady("user-1", "workspace-1", "runtime-1", "daemon-1", "codex", 123)
	if got := ev.Properties["ready_duration_ms"]; got != int64(123) {
		t.Fatalf("ready_duration_ms = %v, want 123", got)
	}
}

func TestFailedEventsUseWillRetry(t *testing.T) {
	runEv := AutopilotRunFailed("user-1", "workspace-1", "autopilot-1", "run-1", "manual", AutopilotAssignee{AgentID: "agent-1", AssigneeType: "agent"}, "manual", "task failed", "task_error", false, 10)
	if got := runEv.Properties["will_retry"]; got != false {
		t.Fatalf("autopilot will_retry = %v, want false", got)
	}
	if _, ok := runEv.Properties["recoverable"]; ok {
		t.Fatalf("autopilot failure should not emit recoverable")
	}
}

func TestIsMetricsOnly(t *testing.T) {
	// Operational / execution-lifecycle events are Prometheus-only and must
	// not be shipped to PostHog.
	for _, name := range []string{
		EventRuntimeRegistered, EventRuntimeReady, EventRuntimeFailed, EventRuntimeOffline,
		EventAutopilotRunStarted, EventAutopilotRunCompleted, EventAutopilotRunFailed,
	} {
		if !IsMetricsOnly(name) {
			t.Errorf("IsMetricsOnly(%q) = false, want true (operational event must stay out of PostHog)", name)
		}
	}

	// Product-behaviour events must still reach PostHog.
	for _, name := range []string{
		EventSignup, EventWorkspaceCreated, EventIssueCreated, EventIssueExecuted,
		EventChatMessageSent, EventAgentCreated, EventAutopilotCreated,
	} {
		if IsMetricsOnly(name) {
			t.Errorf("IsMetricsOnly(%q) = true, want false (product event must reach PostHog)", name)
		}
	}
}