docs(analytics): correct agent_task Prometheus metric contract (MUL-2967)

Address PR review: the agent_task_* "Prometheus-only" banner claimed the old PostHog event properties (task_id, agent_id, duration_ms, error_type, will_retry, ...) were the metric label set. They are not — the real labels are only source/runtime_mode/provider/terminal_status/failure_reason. - Replace the agent_task_* sections with the actual metric names and labels (multica_agent_task_*; see business.go / labels.go), and explain that completed/failed/cancelled are terminal_status values on multica_agent_task_terminal_total, with wall-clock in the *_seconds histograms. - Tighten the runtime_*/autopilot_run_* banners so id properties aren't mistaken for labels. - Drop the stale AgentTask allow-list reference from the pairing lint test header comment. Co-authored-by: multica-agent <github@multica.ai>
chore(analytics): stop shipping operational events to PostHog (MUL-2967)
2026-06-28 18:09:14 +02:00 · 2026-06-03 23:54:48 +08:00 · 2026-06-03 22:40:34 +08:00
7 changed files with 209 additions and 257 deletions
--- a/docs/analytics.md
+++ b/docs/analytics.md
@@ -6,6 +6,20 @@ drives our weekly Active Workspaces (WAW) north-star metric.

 See [MUL-1122](https://github.com/multica-ai/multica) for the design context.

+> **PostHog is reserved for user/product-behaviour events.** High-volume
+> operational / execution-lifecycle telemetry — runtime lifecycle
+> (`runtime_registered` / `runtime_ready` / `runtime_failed` /
+> `runtime_offline`), the agent task lifecycle (`agent_task_*`), and autopilot
+> run lifecycle (`autopilot_run_started` / `autopilot_run_completed` /
+> `autopilot_run_failed`) — is **Prometheus-only** and is **not** shipped to
+> PostHog. Grafana already covers it and the per-event PostHog ingestion cost
+> (these events dominate volume and bill at the identified-event rate) is not
+> justified. The runtime/autopilot events are flagged by
+> `analytics.IsMetricsOnly`, which `metrics.RecordEvent` consults to skip the
+> PostHog `Capture` while still incrementing the Prometheus counter; the
+> `agent_task_*` lifecycle is recorded straight to Prometheus via the typed
+> `BusinessMetrics.RecordTask*` methods and has no `analytics.Event` at all.
+
 ## Configuration

 All analytics shipping is toggled by environment variables (see `.env.example`):
@@ -89,15 +103,18 @@ Every event is assigned to one dashboard category:

 | Category | Events |
 |---|---|
-| `core_loop` | `workspace_created`, `runtime_registered`, `runtime_ready`, `runtime_failed`, `runtime_offline`, `agent_created`, `issue_created`, `chat_message_sent`, `agent_task_queued`, `agent_task_dispatched`, `agent_task_started`, `agent_task_completed`, `agent_task_failed`, `agent_task_cancelled`, `autopilot_run_started`, `autopilot_run_completed`, `autopilot_run_failed` |
+| `core_loop` | `workspace_created`, `agent_created`, `issue_created`, `chat_message_sent`, `issue_executed`, `autopilot_created`, `squad_created` |
 | `onboarding_support` | `onboarding_started`, `onboarding_questionnaire_submitted`, `onboarding_completed`, `onboarding_runtime_path_selected`, `onboarding_runtime_detected` |
 | `acquisition` | `signup`, `download_intent_expressed`, `download_page_viewed`, `download_initiated`, `cloud_waitlist_joined`, `contact_sales_submitted` |
 | `ops_feedback` | `feedback_opened`, `feedback_submitted` |
 | `system/noise` | `$pageview`, `$set`, `$identify`, `$autocapture`, `$rageclick` |
+| `operational (Prometheus-only — NOT in PostHog)` | `runtime_registered`, `runtime_ready`, `runtime_failed`, `runtime_offline`, `agent_task_queued`, `agent_task_dispatched`, `agent_task_started`, `agent_task_completed`, `agent_task_failed`, `agent_task_cancelled`, `autopilot_run_started`, `autopilot_run_completed`, `autopilot_run_failed` |

 The v0 core dashboard must use only `core_loop` plus the specific
 `onboarding_support` steps used by the activation funnel. Acquisition,
-feedback, and system/noise events stay in separate dashboards.
+feedback, and system/noise events stay in separate dashboards. The
+`operational` row is **not shipped to PostHog** — those signals live in
+Grafana via `multica_*` business counters (see `server/internal/metrics`).

 ## Standard core properties

@@ -165,6 +182,13 @@ funnel with "first time user does X" or a cohort on

 ### `runtime_registered`

+> **Prometheus-only — not shipped to PostHog** (see the note at the top of this
+> doc). The `analytics.Event` is still constructed so `metrics.IncForEvent` can
+> derive the Prometheus counter; the fields below are that **event** shape, not
+> a PostHog contract. Only the low-cardinality fields (`runtime_mode`,
+> `provider`) become Prometheus labels — ids like `runtime_id` / `daemon_id`
+> are not labels.
+
 Fires the first time a `(workspace_id, daemon_id, provider)` tuple is
 upserted. Heartbeats and repeat registrations never re-emit. First-time
 detection uses Postgres `xmax = 0` on the upsert RETURNING clause — no
@@ -186,6 +210,8 @@ under a single "anonymous" person.

 ### `runtime_ready`

+> **Prometheus-only — not shipped to PostHog.**
+
 Fires when a runtime is first registered in an online/ready state. This is the
 activation-funnel step that should replace treating `runtime_registered` as
 proof of readiness. The backend emits this only on the INSERT path for a new
@@ -203,6 +229,8 @@ distinct `runtime_id`.

 ### `runtime_failed`

+> **Prometheus-only — not shipped to PostHog.**
+
 Fires when runtime setup/registration fails before a ready runtime can be
 recorded. Today this is scoped to backend registration persistence failures;
 future setup flows should reuse it for provider detection or daemon boot
@@ -218,6 +246,8 @@ failures.

 ### `runtime_offline`

+> **Prometheus-only — not shipped to PostHog.**
+
 Fires when a runtime is explicitly deregistered or the backend sweeper marks it
 offline after missed heartbeats. This is not an activation step; it supports
 local runtime retention and drop-off diagnosis.
@@ -247,40 +277,52 @@ is queued.
 | `agent_id` | string (UUID) | Chat agent. |
 | `source` | string | Always `chat`. |

-### `agent_task_queued` / `agent_task_dispatched` / `agent_task_started` / `agent_task_completed`
+### agent task lifecycle (Prometheus-only)

-Canonical task lifecycle events emitted from `agent_task_queue` state
-transitions. `agent_task_dispatched` fires when the backend claims a queued
-task for a runtime, before the daemon marks it running with
-`agent_task_started`. These events replace `issue_executed` for core loop
-success metrics and allow the activation funnel to split queue backlog from
-claim/start handoff.
+> **Not shipped to PostHog and has no `analytics.Event`.** The agent task
+> lifecycle is recorded directly to Prometheus by the typed
+> `BusinessMetrics.RecordTask*` methods in `server/internal/service/task.go`.
+> The old PostHog event names (`agent_task_queued` / `dispatched` / `started` /
+> `completed` / `failed` / `cancelled`) and their properties (`task_id`,
+> `agent_id`, `issue_id`, `chat_session_id`, `autopilot_run_id`, `duration_ms`,
+> `error_type`, `will_retry`) no longer exist anywhere — those high-cardinality
+> ids were never Prometheus labels and must not be used in dashboards or
+> reconciliation.

-| Property | Type | Description |
+The actual metrics (defined in `server/internal/metrics/business.go`; label
+sets in `server/internal/metrics/labels.go`):
+
+| Metric | Type | Labels |
 |---|---|---|
-| `task_id` | string (UUID) | `agent_task_queue.id`; required. |
-| `agent_id` | string (UUID) | Owning agent. |
-| `issue_id` | string (UUID) | Present for issue-linked tasks. |
-| `chat_session_id` | string (UUID) | Present for chat tasks. |
-| `autopilot_run_id` | string (UUID) | Present for run-only autopilot tasks. |
-| `source` | string | `manual`, `chat`, or `autopilot`. |
-| `runtime_mode` | string | `local` / `cloud`. |
-| `provider` | string | Runtime provider. |
-| `duration_ms` | int64 | Terminal events only; measured from `started_at` when available. |
+| `multica_agent_task_enqueued_total` | counter | `source`, `runtime_mode` |
+| `multica_agent_task_dispatched_total` | counter | `source`, `runtime_mode` |
+| `multica_agent_task_started_total` | counter | `source`, `runtime_mode`, `provider` |
+| `multica_agent_task_terminal_total` | counter | `source`, `runtime_mode`, `terminal_status` |
+| `multica_agent_task_failed_total` | counter | `source`, `runtime_mode`, `failure_reason` |
+| `multica_agent_task_queue_wait_seconds` | histogram | `source`, `runtime_mode` |
+| `multica_agent_task_run_seconds` | histogram | `source`, `runtime_mode`, `terminal_status` |
+| `multica_agent_task_total_seconds` | histogram | `source`, `runtime_mode`, `terminal_status` |

-### `agent_task_failed` / `agent_task_cancelled`
-
-Terminal task lifecycle events. They use the same join fields as
-`agent_task_completed`. `agent_task_failed` also carries:
-
-| Property | Type | Description |
-|---|---|---|
-| `failure_reason` | string | Stable reason from `agent_task_queue.failure_reason`, default `agent_error`. |
-| `error_type` | string | Stable coarse classifier, e.g. `runtime`, `timeout`, `agent_output`, `cancelled`, `agent_error`. |
-| `will_retry` | bool | Whether the backend auto-retry policy will create another task attempt. |
+- `terminal_status` is the task's final `agent_task_queue.status` —
+  `completed` / `failed` / `cancelled`. There is **no** separate
+  completed/cancelled metric: all three land on
+  `multica_agent_task_terminal_total{terminal_status=…}`. Failures
+  additionally increment `multica_agent_task_failed_total` carrying the coarse
+  `failure_reason` (`agent_task_queue.failure_reason`, default `agent_error`).
+- Task wall-clock lives in the `*_seconds` histograms (queue wait / run /
+  total), replacing the old `duration_ms` event property.
+- `source` / `runtime_mode` / `provider` are the normalized label values
+  (`NormalizeTaskSource` / `NormalizeRuntimeMode` / `NormalizeRuntimeProvider`).

 ### `autopilot_run_started` / `autopilot_run_completed` / `autopilot_run_failed`

+> **Prometheus-only — not shipped to PostHog.** The `analytics.*` constructors
+> are retained only so `metrics.IncForEvent` can derive the Prometheus counter;
+> `analytics.IsMetricsOnly` keeps them out of PostHog. Only `cadence`,
+> `trigger_kind`, and `terminal_status` become Prometheus labels — the
+> `autopilot_id` / `autopilot_run_id` / `agent_id` fields below are event shape,
+> not labels.
+
 Fires from `autopilot_run` lifecycle changes. `source` is always
 `autopilot`; the trigger origin is carried in `trigger_source` (`manual`,
 `schedule`, `webhook`, or `api`).
@@ -329,9 +371,11 @@ emit `n=1`. PostHog answers the same question at query time via
 and funnel steps of the form "workspace has had ≥2 `issue_executed`
 events" are expressible without the property. No information is lost.

-Compatibility: `issue_executed` remains a historical compatibility event for
-old dashboards. New core-loop success dashboards should use
-`agent_task_completed` and filter by `source`/`issue_id` as needed.
+`issue_executed` is the canonical **PostHog** core-loop success signal (the
+`agent_task_*` lifecycle that previously served per-task success dashboards is
+now Prometheus-only). Per-task completion counts live in Grafana via
+`BusinessMetrics.RecordTaskTerminal`; use `issue_executed` for the
+PostHog-side activation funnel and filter by `source` as needed.

 ### `team_invite_sent`

@@ -604,8 +648,10 @@ sent from a pre-workspace surface.

 ## Reconciliation

-`agent_task_completed` is the canonical PostHog-side task success event. It
-should reconcile daily against the operational source of truth:
+Per-task completion is no longer shipped to PostHog. Task success now
+reconciles **DB ↔ Prometheus** instead of DB ↔ PostHog: the
+`BusinessMetrics.RecordTaskTerminal` counter (exported as a `multica_*` task
+metric) should track the operational source of truth:

 ```sql
 SELECT date_trunc('day', completed_at AT TIME ZONE 'UTC') AS day,
@@ -617,22 +663,13 @@ GROUP BY 1
 ORDER BY 1;
 ```

-Equivalent HogQL:
+Compare against the equivalent Prometheus counter in Grafana. The expected
+difference should be near zero; sustained drift means either an emission site
+is missing or the metrics pipeline is unhealthy.

-```sql
-SELECT toStartOfDay(timestamp) AS day,
-       count() AS posthog_completed_tasks
-FROM events
-WHERE event = 'agent_task_completed'
-  AND properties.environment = 'production'
-  AND timestamp >= now() - interval 30 day
-GROUP BY day
-ORDER BY day
-```
-
-The expected difference should be near zero. Allow a small delay window for
-PostHog ingestion and backend analytics queue drops; sustained drift means
-either an emission site is missing or PostHog shipping is unhealthy.
+On the PostHog side, `issue_executed` remains the product-level success signal
+(at most one per issue) and can be reconciled against
+`issue.first_executed_at` if needed.

 ## Governance

--- a/server/internal/analytics/events.go
+++ b/server/internal/analytics/events.go
@@ -13,12 +13,6 @@ const (
 	EventIssueExecuted                 = "issue_executed"
 	EventIssueCreated                  = "issue_created"
 	EventChatMessageSent               = "chat_message_sent"
-	EventAgentTaskQueued               = "agent_task_queued"
-	EventAgentTaskDispatched           = "agent_task_dispatched"
-	EventAgentTaskStarted              = "agent_task_started"
-	EventAgentTaskCompleted            = "agent_task_completed"
-	EventAgentTaskFailed               = "agent_task_failed"
-	EventAgentTaskCancelled            = "agent_task_cancelled"
 	EventAutopilotRunStarted           = "autopilot_run_started"
 	EventAutopilotRunCompleted         = "autopilot_run_completed"
 	EventAutopilotRunFailed            = "autopilot_run_failed"
@@ -37,6 +31,35 @@ const (

 const EventSchemaVersion = 2

+// metricsOnlyEvents are operational / execution-lifecycle events that are
+// recorded to Prometheus (via metrics.IncForEvent, for Grafana) but are
+// deliberately NOT shipped to PostHog. They are high-volume runtime/autopilot
+// telemetry whose per-event PostHog ingestion cost is not justified — Grafana
+// already carries the equivalent counters. metrics.RecordEvent consults this
+// set and skips the PostHog Capture for these names while still incrementing
+// the counter. PostHog is reserved for user/product-behaviour events.
+//
+// Note: agent_task_* lifecycle events are also Prometheus-only, but their
+// Prometheus side is handled by typed BusinessMetrics.RecordTask* methods, so
+// they never build an analytics.Event in the first place and don't need an
+// entry here.
+var metricsOnlyEvents = map[string]struct{}{
+	EventRuntimeRegistered:     {},
+	EventRuntimeReady:          {},
+	EventRuntimeFailed:         {},
+	EventRuntimeOffline:        {},
+	EventAutopilotRunStarted:   {},
+	EventAutopilotRunCompleted: {},
+	EventAutopilotRunFailed:    {},
+}
+
+// IsMetricsOnly reports whether an event name is operational telemetry that
+// must be counted in Prometheus but not sent to PostHog. See metricsOnlyEvents.
+func IsMetricsOnly(name string) bool {
+	_, ok := metricsOnlyEvents[name]
+	return ok
+}
+
 const (
 	SourceOnboarding = "onboarding"
 	SourceManual     = "manual"
@@ -306,39 +329,6 @@ func ChatMessageSent(userID, workspaceID, chatSessionID, taskID, agentID, runtim
 	}
 }

-func AgentTaskQueued(ctx TaskContext) Event {
-	return agentTaskEvent(EventAgentTaskQueued, ctx, nil)
-}
-
-func AgentTaskDispatched(ctx TaskContext) Event {
-	return agentTaskEvent(EventAgentTaskDispatched, ctx, nil)
-}
-
-func AgentTaskStarted(ctx TaskContext) Event {
-	return agentTaskEvent(EventAgentTaskStarted, ctx, nil)
-}
-
-func AgentTaskCompleted(ctx TaskContext, durationMS int64) Event {
-	return agentTaskEvent(EventAgentTaskCompleted, ctx, map[string]any{
-		"duration_ms": durationMS,
-	})
-}
-
-func AgentTaskFailed(ctx TaskContext, durationMS int64, failureReason, errorType string, willRetry bool) Event {
-	return agentTaskEvent(EventAgentTaskFailed, ctx, map[string]any{
-		"duration_ms":    durationMS,
-		"failure_reason": failureReason,
-		"error_type":     errorType,
-		"will_retry":     willRetry,
-	})
-}
-
-func AgentTaskCancelled(ctx TaskContext, durationMS int64) Event {
-	return agentTaskEvent(EventAgentTaskCancelled, ctx, map[string]any{
-		"duration_ms": durationMS,
-	})
-}
-
 // AutopilotAssignee describes the autopilot's configured target. agent_id is
 // always the agent that will actually execute the work (the squad leader for
 // squad autopilots) so funnels grouping by agent stay consistent. assignee_*
@@ -657,16 +647,6 @@ func AutopilotCreated(actorID, workspaceID, autopilotID, cadence, triggerKind st
 	}
 }

-func agentTaskEvent(name string, ctx TaskContext, extra map[string]any) Event {
-	props := withCoreProperties(extra, CoreProperties(ctx))
-	return Event{
-		Name:        name,
-		DistinctID:  distinctID(ctx.UserID, ctx.WorkspaceID, ctx.AgentID),
-		WorkspaceID: ctx.WorkspaceID,
-		Properties:  props,
-	}
-}
-
 func autopilotRunEvent(name, actorID, workspaceID, autopilotID, runID, cadence string, assignee AutopilotAssignee, triggerSource string, extra map[string]any) Event {
 	if extra == nil {
 		extra = map[string]any{}
@@ -733,20 +713,6 @@ func withCoreProperties(props map[string]any, core CoreProperties) map[string]an
 	return props
 }

-func distinctID(userID, workspaceID, agentID string) string {
-	if userID != "" {
-		return userID
-	}
-	// Synthetic PostHog distinct IDs are namespace-prefixed; user UUIDs are not.
-	if agentID != "" {
-		return "agent:" + agentID
-	}
-	if workspaceID != "" {
-		return "workspace:" + workspaceID
-	}
-	return ""
-}
-
 func nonAgentUserID(distinct string) string {
 	if distinct == "" || strings.Contains(distinct, ":") {
 		return ""
--- a/server/internal/analytics/events_test.go
+++ b/server/internal/analytics/events_test.go
@@ -15,21 +15,6 @@ func TestRuntimeReadyOmitsUnmeasuredDuration(t *testing.T) {
 }

 func TestFailedEventsUseWillRetry(t *testing.T) {
-	ctx := TaskContext{
-		UserID:      "user-1",
-		WorkspaceID: "workspace-1",
-		AgentID:     "agent-1",
-		TaskID:      "task-1",
-		Source:      SourceManual,
-	}
-	taskEv := AgentTaskFailed(ctx, 10, "runtime_offline", "runtime", true)
-	if got := taskEv.Properties["will_retry"]; got != true {
-		t.Fatalf("task will_retry = %v, want true", got)
-	}
-	if _, ok := taskEv.Properties["recoverable"]; ok {
-		t.Fatalf("task failure should not emit recoverable")
-	}
-
 	runEv := AutopilotRunFailed("user-1", "workspace-1", "autopilot-1", "run-1", "manual", AutopilotAssignee{AgentID: "agent-1", AssigneeType: "agent"}, "manual", "task failed", "task_error", false, 10)
 	if got := runEv.Properties["will_retry"]; got != false {
 		t.Fatalf("autopilot will_retry = %v, want false", got)
@@ -39,29 +24,24 @@ func TestFailedEventsUseWillRetry(t *testing.T) {
 	}
 }

-func TestAgentTaskDispatchedUsesTaskCoreProperties(t *testing.T) {
-	ctx := TaskContext{
-		UserID:      "user-1",
-		WorkspaceID: "workspace-1",
-		AgentID:     "agent-1",
-		TaskID:      "task-1",
-		IssueID:     "issue-1",
-		Source:      SourceManual,
-		RuntimeMode: "local",
-		Provider:    "codex",
+func TestIsMetricsOnly(t *testing.T) {
+	// Operational / execution-lifecycle events are Prometheus-only and must
+	// not be shipped to PostHog.
+	for _, name := range []string{
+		EventRuntimeRegistered, EventRuntimeReady, EventRuntimeFailed, EventRuntimeOffline,
+		EventAutopilotRunStarted, EventAutopilotRunCompleted, EventAutopilotRunFailed,
+	} {
+		if !IsMetricsOnly(name) {
+			t.Errorf("IsMetricsOnly(%q) = false, want true (operational event must stay out of PostHog)", name)
+		}
 	}
-	ev := AgentTaskDispatched(ctx)
-
-	if ev.Name != EventAgentTaskDispatched {
-		t.Fatalf("event name = %q, want %q", ev.Name, EventAgentTaskDispatched)
-	}
-	if got := ev.WorkspaceID; got != "workspace-1" {
-		t.Fatalf("workspace_id = %q, want workspace-1", got)
-	}
-	if got := ev.Properties["task_id"]; got != "task-1" {
-		t.Fatalf("task_id = %v, want task-1", got)
-	}
-	if got := ev.Properties["runtime_mode"]; got != "local" {
-		t.Fatalf("runtime_mode = %v, want local", got)
+	// Product-behaviour events must still reach PostHog.
+	for _, name := range []string{
+		EventSignup, EventWorkspaceCreated, EventIssueCreated, EventIssueExecuted,
+		EventChatMessageSent, EventAgentCreated, EventAutopilotCreated,
+	} {
+		if IsMetricsOnly(name) {
+			t.Errorf("IsMetricsOnly(%q) = true, want false (product event must reach PostHog)", name)
+		}
 	}
 }
--- a/server/internal/metrics/business_events.go
+++ b/server/internal/metrics/business_events.go
@@ -239,13 +239,18 @@ func (e *businessEventMetrics) collectors() []prometheus.Collector {
 // `m = nil` (no metrics) safely; both sides are best-effort and never block
 // the request path.
 //
+// Operational / execution-lifecycle events flagged by analytics.IsMetricsOnly
+// (runtime_*, autopilot_run_*) still increment their Prometheus counter but are
+// NOT shipped to PostHog — Grafana already covers them and their high volume is
+// not worth the per-event PostHog ingestion cost. PostHog is reserved for
+// user/product-behaviour events.
+//
 // This is the canonical way to emit any of the funnel / community / commercial
 // PostHog events from server code. Direct analytics.Client.Capture(...) with
 // an event constructed from analytics.* is rejected by the lint test in
-// business_pairing_test.go — see that test for the allow-list of events whose
-// Prometheus side is handled by typed BusinessMetrics methods (multica_agent_task_*).
+// business_pairing_test.go.
 func RecordEvent(client analytics.Client, m *BusinessMetrics, ev analytics.Event) {
-	if client != nil {
+	if client != nil && !analytics.IsMetricsOnly(ev.Name) {
 		client.Capture(ev)
 	}
 	if m != nil {
@@ -344,11 +349,11 @@ func (m *BusinessMetrics) IncForEvent(ev analytics.Event) {
 	case analytics.EventContactSalesSubmitted:
 		m.events.contactSalesSubmitted.WithLabelValues(NormalizeContactSalesSource(stringProp(ev.Properties, "form_source"))).Inc()
 	default:
-		// AgentTask* events are intentionally not dispatched here — their
-		// Prometheus side is handled by typed BusinessMetrics.RecordTask*
-		// methods that take queue/run/total seconds the analytics event
-		// does not carry. The lint test allow-lists those names.
-		// Anything else is a missing case and the lint test will fail CI.
+		// agent_task_* lifecycle telemetry is recorded straight to Prometheus
+		// via the typed BusinessMetrics.RecordTask* methods (they take
+		// queue/run/total seconds that an analytics.Event does not carry), so
+		// there is no analytics.Event to dispatch here. Anything else reaching
+		// this default is a missing case and the lint test will fail CI.
 	}
 }

--- a/server/internal/metrics/business_pairing_test.go
+++ b/server/internal/metrics/business_pairing_test.go
@@ -4,8 +4,10 @@ package metrics_test
 // server/internal/analytics/events.go has a paired Prometheus counter
 // reachable through metrics.RecordEvent — and that every
 // h.Analytics.Capture(analytics.<Helper>(...)) call site goes through
-// metrics.RecordEvent (no naked Capture allowed except for the AgentTask*
-// allow-list whose Prometheus side is handled by typed PR2 methods).
+// metrics.RecordEvent (no naked Capture allowed). The agent task lifecycle is
+// no longer an analytics.Event — it is recorded straight to Prometheus via the
+// typed BusinessMetrics.RecordTask* methods — so there is no longer an
+// AgentTask* allow-list here.

 import (
 	"go/ast"
@@ -21,22 +23,6 @@ import (
 	"github.com/multica-ai/multica/server/internal/metrics"
 )

-// taskMetricEvents are emitted via the typed PR2 methods (RecordTaskEnqueued,
-// RecordTaskDispatched, RecordTaskStarted, RecordTaskTerminal, RecordTaskFailed)
-// instead of the generic RecordEvent dispatcher because their Prometheus side
-// needs queue/run/total seconds that the analytics event does not carry.
-//
-// These names are still required to be paired — the lint test verifies they
-// have a typed RecordTask* hit in service/task.go.
-var taskMetricEvents = map[string]string{
-	analytics.EventAgentTaskQueued:     "RecordTaskEnqueued",
-	analytics.EventAgentTaskDispatched: "RecordTaskDispatched",
-	analytics.EventAgentTaskStarted:    "RecordTaskStarted",
-	analytics.EventAgentTaskCompleted:  "RecordTaskTerminal",
-	analytics.EventAgentTaskFailed:     "RecordTaskFailed",
-	analytics.EventAgentTaskCancelled:  "RecordTaskTerminal",
-}
-
 // frontendOnlyEvents are declared in events.go but emitted from the frontend,
 // not from server code. They still need a Prometheus counter (so a future
 // server-side emission point lights up the same label set) but the server
@@ -46,10 +32,13 @@ var frontendOnlyEvents = map[string]bool{
 }

 // TestEveryAnalyticsEventHasPrometheusCounter asserts that every Event*
-// constant declared in analytics/events.go either:
-//   - is dispatched by metrics.IncForEvent (verified by sending a synthetic
-//     event through RecordEvent and observing a counter delta), or
-//   - is in the typed taskMetricEvents allow-list.
+// constant declared in analytics/events.go is dispatched by
+// metrics.IncForEvent (verified by sending a synthetic event through
+// RecordEvent and observing a counter delta).
+//
+// Note: agent_task_* lifecycle telemetry is Prometheus-only via the typed
+// BusinessMetrics.RecordTask* methods and is no longer declared as an
+// analytics.Event, so there are no agent_task constants to exempt here.
 func TestEveryAnalyticsEventHasPrometheusCounter(t *testing.T) {
 	t.Parallel()

@@ -57,9 +46,6 @@ func TestEveryAnalyticsEventHasPrometheusCounter(t *testing.T) {

 	m := metrics.NewBusinessMetrics()
 	for name := range declared {
-		if _, allowed := taskMetricEvents[name]; allowed {
-			continue
-		}
 		// Build a minimal event with the required label properties that the
 		// dispatcher reads. Since IncForEvent reads via stringProp helpers,
 		// a nil Properties map is acceptable for events with empty label
@@ -78,11 +64,10 @@ func TestEveryAnalyticsEventHasPrometheusCounter(t *testing.T) {

 // TestNoNakedAnalyticsCaptureInHandlersOrServices walks every Go file under
 // server/internal/handler and server/internal/service and asserts that every
-// `<x>.Analytics.Capture(analytics.<Helper>(...))` call has been migrated to
-// metrics.RecordEvent. The only exception is the body of
-// `service/task.go`'s captureTaskEvent function — a function-granular
-// allow-list, not a whole-file one, so any new naked Capture added to the
-// same file fails CI.
+// `<x>.Analytics.Capture(analytics.<Helper>(...))` call goes through
+// metrics.RecordEvent. There are no exceptions: every server-side PostHog
+// event must flow through RecordEvent so the Prometheus and PostHog sides
+// cannot drift.
 func TestNoNakedAnalyticsCaptureInHandlersOrServices(t *testing.T) {
 	t.Parallel()

@@ -93,13 +78,10 @@ func TestNoNakedAnalyticsCaptureInHandlersOrServices(t *testing.T) {
 	}
 	// allowedFunctions is keyed by absolute file path, valued by the set of
 	// function names whose bodies are allowed to call Analytics.Capture
-	// directly. Granularity is per-function, not per-file: anything else
-	// in the same file still trips the check.
-	allowedFunctions := map[string]map[string]struct{}{
-		filepath.Join(repoRoot(t), "internal", "service", "task.go"): {
-			"captureTaskEvent": {},
-		},
-	}
+	// directly. Granularity is per-function, not per-file. Currently empty —
+	// no server code is permitted to call Analytics.Capture outside
+	// metrics.RecordEvent.
+	allowedFunctions := map[string]map[string]struct{}{}

 	var offenders []string
 	fset := token.NewFileSet()
--- a/server/internal/metrics/record_event_test.go
+++ b/server/internal/metrics/record_event_test.go
@@ -0,0 +1,44 @@
+package metrics_test
+
+import (
+	"testing"
+
+	"github.com/multica-ai/multica/server/internal/analytics"
+	"github.com/multica-ai/multica/server/internal/metrics"
+)
+
+// captureSpy records the names of every event handed to Capture so tests can
+// assert which events actually reach PostHog.
+type captureSpy struct{ names []string }
+
+func (c *captureSpy) Capture(e analytics.Event) { c.names = append(c.names, e.Name) }
+func (c *captureSpy) Close()                    {}
+
+// TestRecordEventSkipsPostHogForMetricsOnly verifies that operational events
+// flagged by analytics.IsMetricsOnly still increment a Prometheus counter but
+// are NOT shipped to PostHog, while product-behaviour events are shipped.
+func TestRecordEventSkipsPostHogForMetricsOnly(t *testing.T) {
+	spy := &captureSpy{}
+	m := metrics.NewBusinessMetrics()
+
+	// Operational event: Prometheus counter moves, PostHog gets nothing.
+	before := metrics.SumAllCounters(m)
+	metrics.RecordEvent(spy, m, analytics.RuntimeOffline("user-1", "ws-1", "rt-1", "daemon-1", "claude"))
+	if len(spy.names) != 0 {
+		t.Fatalf("runtime_offline shipped %d events to PostHog, want 0: %v", len(spy.names), spy.names)
+	}
+	if metrics.SumAllCounters(m) <= before {
+		t.Fatalf("runtime_offline did not increment a Prometheus counter")
+	}
+
+	metrics.RecordEvent(spy, m, analytics.AutopilotRunCompleted("user-1", "ws-1", "ap-1", "run-1", "manual", analytics.AutopilotAssignee{AgentID: "agent-1", AssigneeType: "agent"}, "manual", 10))
+	if len(spy.names) != 0 {
+		t.Fatalf("autopilot_run_completed shipped to PostHog, want 0: %v", spy.names)
+	}
+
+	// Product-behaviour event: still shipped to PostHog.
+	metrics.RecordEvent(spy, m, analytics.WorkspaceCreated("user-1", "ws-1"))
+	if len(spy.names) != 1 || spy.names[0] != analytics.EventWorkspaceCreated {
+		t.Fatalf("workspace_created was not shipped to PostHog exactly once: %v", spy.names)
+	}
+}
--- a/server/internal/service/task.go
+++ b/server/internal/service/task.go
@@ -140,7 +140,6 @@ func (s *TaskService) captureTaskQueued(ctx context.Context, task db.AgentTaskQu
 		source, runtimeMode, _ := s.taskMetricsContext(ctx, task)
 		s.Metrics.RecordTaskEnqueued(source, runtimeMode)
 	}
-	s.captureTaskEvent(ctx, analytics.AgentTaskQueued(s.taskAnalyticsContext(ctx, task)))
 }

 func (s *TaskService) captureTaskDispatched(ctx context.Context, task db.AgentTaskQueue) {
@@ -148,7 +147,6 @@ func (s *TaskService) captureTaskDispatched(ctx context.Context, task db.AgentTa
 		source, runtimeMode, _ := s.taskMetricsContext(ctx, task)
 		s.Metrics.RecordTaskDispatched(util.UUIDToString(task.ID), source, runtimeMode, taskQueueWaitSeconds(task))
 	}
-	s.captureTaskEvent(ctx, analytics.AgentTaskDispatched(s.taskAnalyticsContext(ctx, task)))
 }

 func (s *TaskService) AnalyticsContextForTask(ctx context.Context, task db.AgentTaskQueue) analytics.TaskContext {
@@ -160,7 +158,6 @@ func (s *TaskService) captureTaskStarted(ctx context.Context, task db.AgentTaskQ
 		source, runtimeMode, provider := s.taskMetricsContext(ctx, task)
 		s.Metrics.RecordTaskStarted(source, runtimeMode, provider)
 	}
-	s.captureTaskEvent(ctx, analytics.AgentTaskStarted(s.taskAnalyticsContext(ctx, task)))
 }

 func (s *TaskService) captureTaskCompleted(ctx context.Context, task db.AgentTaskQueue) {
@@ -168,10 +165,6 @@ func (s *TaskService) captureTaskCompleted(ctx context.Context, task db.AgentTas
 		source, runtimeMode, _ := s.taskMetricsContext(ctx, task)
 		s.Metrics.RecordTaskTerminal(util.UUIDToString(task.ID), source, runtimeMode, task.Status, taskRunSeconds(task), taskTotalSeconds(task), task.Attempt)
 	}
-	s.captureTaskEvent(ctx, analytics.AgentTaskCompleted(
-		s.taskAnalyticsContext(ctx, task),
-		taskDurationMS(task),
-	))
 }

 func (s *TaskService) captureTaskFailed(ctx context.Context, task db.AgentTaskQueue) {
@@ -181,13 +174,6 @@ func (s *TaskService) captureTaskFailed(ctx context.Context, task db.AgentTaskQu
 		s.Metrics.RecordTaskTerminal(util.UUIDToString(task.ID), source, runtimeMode, task.Status, taskRunSeconds(task), taskTotalSeconds(task), task.Attempt)
 		s.Metrics.RecordTaskFailed(source, runtimeMode, failureReason)
 	}
-	s.captureTaskEvent(ctx, analytics.AgentTaskFailed(
-		s.taskAnalyticsContext(ctx, task),
-		taskDurationMS(task),
-		failureReason,
-		taskErrorType(failureReason),
-		s.willRetryTask(task),
-	))
 }

 func (s *TaskService) captureTaskCancelled(ctx context.Context, task db.AgentTaskQueue) {
@@ -195,10 +181,6 @@ func (s *TaskService) captureTaskCancelled(ctx context.Context, task db.AgentTas
 		source, runtimeMode, _ := s.taskMetricsContext(ctx, task)
 		s.Metrics.RecordTaskTerminal(util.UUIDToString(task.ID), source, runtimeMode, task.Status, taskRunSeconds(task), taskTotalSeconds(task), task.Attempt)
 	}
-	s.captureTaskEvent(ctx, analytics.AgentTaskCancelled(
-		s.taskAnalyticsContext(ctx, task),
-		taskDurationMS(task),
-	))
 	// Revoke any mat_ task tokens minted for this task. Cancellation is
 	// a terminal transition, so the running agent process no longer
 	// needs to call back; eagerly deleting the token closes the
@@ -239,16 +221,6 @@ func (s *TaskService) CaptureLeaseExpiredTasks(ctx context.Context, tasks []db.A
 	}
 }

-func (s *TaskService) captureTaskEvent(ctx context.Context, event analytics.Event) {
-	if s.Analytics == nil {
-		return
-	}
-	if event.WorkspaceID == "" {
-		return
-	}
-	s.Analytics.Capture(event)
-}
-
 func (s *TaskService) cachedTaskAnalyticsContext(task db.AgentTaskQueue) (analytics.TaskContext, bool) {
 	key := taskAnalyticsContextKey(task)
 	if key == "" {
@@ -410,26 +382,6 @@ func (s *TaskService) taskAnalyticsContext(ctx context.Context, task db.AgentTas
 	return tc
 }

-func taskDurationMS(task db.AgentTaskQueue) int64 {
-	if !task.CompletedAt.Valid {
-		return 0
-	}
-	start := task.CreatedAt
-	if task.StartedAt.Valid {
-		start = task.StartedAt
-	} else if task.DispatchedAt.Valid {
-		start = task.DispatchedAt
-	}
-	if !start.Valid {
-		return 0
-	}
-	ms := task.CompletedAt.Time.Sub(start.Time).Milliseconds()
-	if ms < 0 {
-		return 0
-	}
-	return ms
-}
-
 func taskQueueWaitSeconds(task db.AgentTaskQueue) float64 {
 	return durationSeconds(task.CreatedAt, task.DispatchedAt)
 }
@@ -475,20 +427,6 @@ func taskErrorType(reason string) string {
 	}
 }

-func (s *TaskService) willRetryTask(task db.AgentTaskQueue) bool {
-	reason := taskFailureReason(task)
-	if !retryableReasons[reason] {
-		return false
-	}
-	if task.Attempt >= task.MaxAttempts {
-		return false
-	}
-	if task.AutopilotRunID.Valid {
-		return false
-	}
-	return task.IssueID.Valid || task.ChatSessionID.Valid
-}
-
 // EnqueueTaskForIssue creates a queued task for an agent-assigned issue.
 // No context snapshot is stored — the agent fetches all data it needs at
 // runtime via the multica CLI.