Files
multica/server/migrations/050_issue_first_executed_at.down.sql
devv-eve 637bdc8eb3 feat(analytics): full PostHog pipeline + 6 funnel events (MUL-1122) (#1367)
* feat(analytics): add PostHog client with async batch shipping

Introduces server/internal/analytics, the shipping layer for the product
funnel defined in docs/analytics.md. Capture is non-blocking — events are
enqueued into a bounded channel and a background worker batches them to
PostHog's /batch/ endpoint. A broken backend drops events rather than
blocking request handlers.

Local dev and self-hosted instances run a noop client until the operator
sets POSTHOG_API_KEY. This is PR 1 of MUL-1122; signup and workspace_created
emission land in the follow-up commit so this change is independently
reviewable.

* feat(server): emit signup and workspace_created analytics events

Wires analytics.Client through handler.New and main, then emits the first
two funnel events:

- signup fires from findOrCreateUser (which now reports isNew), covering
  both the verification-code and Google OAuth entry points — a single
  emission site guarantees Google signups aren't missed.
- workspace_created fires after the CreateWorkspace transaction commits,
  with is_first_workspace computed from a post-commit ListWorkspaces count
  so we can distinguish fresh-user activation from returning-user
  expansion.

Tests use analytics.NoopClient so nothing ships from test runs. PR 1 of
MUL-1122; runtime_registered and issue_executed follow in later PRs per
the plan.

* refactor(analytics): drop is_first_workspace from workspace_created

Stamping "is this the user's first workspace?" at emit time races under
concurrent CreateWorkspace requests: two transactions committing close
together can both read a post-commit count greater than one and both emit
false. Fixing it at the SQL layer requires a schema change we don't want in
PR 1.

PostHog answers the same question exactly from the event stream (funnel on
"first time user does X" / cohort on $initial_event), so removing the
property loses no information and makes the emit side race-free.

* docs(analytics): document self-host safety defaults

Spell out why self-hosted instances never ship events upstream by default
(empty POSTHOG_API_KEY → noop client) and explain how operators can point
at their own PostHog project without any code change.

* feat(analytics): emit runtime_registered, issue_executed, team_invite_*

Three server-side funnel events, all gated on first-time state transitions
so retries and re-runs don't inflate the WAW buckets:

- runtime_registered fires from DaemonRegister when UpsertAgentRuntime
  reports (xmax = 0) — i.e. the row was inserted, not updated. Heartbeats
  and re-registrations stay silent.
- issue_executed fires from CompleteTask after an atomic
  UPDATE issue SET first_executed_at = now() WHERE id = $1 AND
  first_executed_at IS NULL flips the column for the first time. Retries,
  re-assignments, and comment-triggered follow-up tasks hit the WHERE
  clause and no-op. Carries nth_issue_for_workspace so the ≥1/≥2/≥5/≥10
  buckets filter without extra queries.
- team_invite_sent fires from CreateInvitation and team_invite_accepted
  from AcceptInvitation, closing the expansion funnel.

Adds a 050 migration for issue.first_executed_at plus a partial index so
the workspace-scoped executed-count query doesn't scan the never-executed
tail.

* feat(config): surface PostHog key via /api/config

Extends AppConfig with posthog_key / posthog_host sourced from env on
every request (so operators can rotate the key via secret refresh without
a restart). Reading the key off the server — rather than baking it into
the frontend bundle via NEXT_PUBLIC_* — means self-hosted instances
inherit the blank key automatically and never ship events upstream.

* feat(analytics): wire posthog-js identify + UTM capture on the client

Adds @multica/core/analytics — a thin wrapper around posthog-js that owns
attribution capture and identity merge. Posthog-js config comes from
/api/config (not NEXT_PUBLIC_*), so self-hosted instances whose server
returns an empty key automatically run the SDK inert.

captureSignupSource stamps a multica_signup_source cookie with UTM params
and the referrer's origin (never the full referrer — that can leak OAuth
code/state in the callback URL). The backend signup event reads this
cookie on new-user creation.

Identity flows:
- auth-initializer fires identify() right after getMe() resolves, on both
  cookie and token paths. A getConfig/getMe race is handled by buffering
  a pending identify inside the analytics module and flushing it once
  initAnalytics finishes.
- auth store calls identify() on verifyCode / loginWithGoogle /
  loginWithToken and resetAnalytics() on logout so the next login merges
  cleanly without bleeding events.

* docs(analytics): describe runtime_registered, issue_executed, invite events

Fills in the schema for the remaining funnel events. Captures the
design commentary that belongs next to the contract rather than in a PR
description — in particular why issue_executed uses the atomic
first_executed_at flip instead of counting task-terminal events, and why
runtime_registered relies on xmax = 0 rather than a query-then-write.

* fix(analytics): drop non-atomic nth_issue_for_workspace from issue_executed

Computing the workspace's Nth-issue ordinal at emit time is not atomic
under concurrent first-completions — two transactions can both run
MarkIssueFirstExecuted, then both run CountExecutedIssuesInWorkspace, and
both observe count=1 before either has committed, so both events go out
stamped as n=1. Serialising it would mean a per-workspace advisory lock
or a SERIALIZABLE-isolated tx; PostHog answers the same question exactly
at query time via row_number() partitioned by workspace_id, so the
emit-time property adds risk without adding information.

Removes the property from analytics.IssueExecuted, deletes the unused
CountExecutedIssuesInWorkspace query, and regenerates sqlc. The partial
index stays — any future workspace-scoped executed-issue query will want
it.

* fix(analytics): wire $pageview and harden signup_source cookie payload

Two frontend fixes from the PR review:

- PageviewTracker, mounted under WebProviders, fires capturePageview on
  every Next.js App Router path / query-string change. Without this the
  capturePageview helper in @multica/core/analytics was never called and
  the acquisition funnel's / → signup step was empty.
- captureSignupSource now caps each UTM / referrer value at 96 chars
  *before* JSON.stringify, and drops the whole cookie when the serialised
  payload still exceeds 512 chars. Previously the overall slice(0, 256)
  could leave a half-JSON string on the wire that neither the backend nor
  PostHog could parse.

Both capturePageview and identify now buffer a single pending call when
fired before initAnalytics resolves — otherwise the initial "/" pageview
and same-turn login identify race the /api/config fetch and get dropped.
resetAnalytics clears both buffers so a logout→login cycle stays clean.

* fix(analytics): URL-decode signup_source cookie on read

Go does not URL-decode Cookie.Value automatically, so the frontend's
JSON-then-encodeURIComponent payload was landing in PostHog as
percent-encoded garbage (%7B%22utm_source...). Unescape on read so the
backend receives the original JSON string the frontend intended, and
drop values that fail to decode or exceed the server-side cap — sending
truncated garbage is worse than sending nothing. Oversized-cookie guard
matches the frontend's SIGNUP_SOURCE_MAX_LEN.

* docs(analytics): reflect nth-issue drop, $pageview wiring, cookie encoding

Pulls the schema doc back in line with the code: issue_executed no longer
advertises nth_issue_for_workspace (with a note about why PostHog derives
it at query time instead), the frontend $pageview section names the
actual PageviewTracker component that fires it, and the signup_source
section documents the per-value cap / overall drop rule and the
encode-on-write / decode-on-read contract.

---------

Co-authored-by: Jiang Bohan <bhjiang@outlook.com>
2026-04-21 14:42:52 +08:00

3 lines
109 B
SQL

DROP INDEX IF EXISTS idx_issue_first_executed_at;
ALTER TABLE issue DROP COLUMN IF EXISTS first_executed_at;