multica

mirror of https://github.com/multica-ai/multica.git synced 2026-07-05 21:39:54 +02:00

Author	SHA1	Message	Date
LinYushen	de900b2ba6	feat(server): funnel/community/commercial business metrics + PostHog pairing (MUL-2949) (#3698)

* feat(server): funnel/community/commercial business metrics + PostHog pairing (MUL-2949)

PR3 of the Grafana board metrics split (parent MUL-2328). Adds 23 new Prometheus counter/histogram families to the PR2 BusinessMetrics collector covering the activation/community/commercial funnels, and binds every PostHog event emission to a matching metric increment so the two sides cannot drift.

Funnel: signup, workspace_created, team_invite_sent/accepted, onboarding_, cloud_waitlist_joined.
Content: issue_created, chat_message_sent, agent_created, squad_created, autopilot_created, issue_executed.
Runtime: runtime_registered/ready/failed/offline + ready_seconds histogram, daemon_ws_message_received_total.
Autopilot: autopilot_run_started/terminal/skipped.
Webhook/GitHub: webhook_delivery_total, github_event_received_total, github_pr_review_total, github_pr_merge_seconds histogram.
CloudRuntime: cloudruntime_request_total + duration histogram, wired through a small RequestRecorder interface so the cloudruntime package stays decoupled from metrics.
Commercial: feedback_submitted, contact_sales_submitted.

The pairing helper metrics.RecordEvent(client, m, ev) emits the PostHog event AND increments the matching counter via IncForEvent dispatch, reading labels from the analytics event Properties. Every existing h.Analytics.Capture(analytics.X(...)) call site has been migrated to the helper across handler/, service/, and cmd/server/runtime_sweeper.go.

Lint enforcement (server/internal/metrics/business_pairing_test.go):
- TestEveryAnalyticsEventHasPrometheusCounter: every Event constant in analytics/events.go either dispatches via IncForEvent or is in the taskMetricEvents allow-list (PR2 typed RecordTask* methods).
- TestNoNakedAnalyticsCaptureInHandlersOrServices: AST-walks handler/ service/cmd-server for direct Analytics.Capture(...) calls; only service/task.go's captureTaskEvent helper is allow-listed.
- TestEveryAnalyticsRecordEventTakesAnalyticsHelper: validates the third arg of every metrics.RecordEvent call is built from analytics..

Cardinality protection: all new label values pass through fixed allow-lists in labels_pr3.go; unknown values collapse to 'other'/'unknown'/'error'.

Refs:
- Spec MUL-2328 / MUL-2949.
- Builds on PR2 (MUL-2948): collectors registered through the same BusinessMetrics struct, no separate Registry.
- Uses PR1's taskfailure.Reason (MUL-2946) for runtime_failed's failure_reason label via NormalizeFailureReason.

Out of scope: Sampler-class metrics (PR4 / MUL-2947), pr_review_total emission point (no review event handler exists yet; counter is defined, TODO to wire up when /api/webhooks/github grows pull_request_review handling).

Co-authored-by: multica-agent <github@multica.ai>

* fix(server): tighten PR3 review items: signup_source bucket, fill platform/kind/form_source enums, onboarding_started server emission, lint scope (MUL-2949)

Addresses 张大彪's review on #3698:

1. signup_source: NormalizeSignupSource added to labels_pr3.go with a fixed allow-list bucket (direct/google/twitter/linkedin/.../other). Parses JSON cookie payload for utm_source/source/referrer fields, strips URL schemes, maps well-known hostnames to channel buckets. PostHog event still ships the raw cookie value for analytics; only the Prometheus label is bucketed.

2. Filled the unknown/other label gaps:
   - analytics.IssueCreated and analytics.ChatMessageSent now take a platform parameter sourced from middleware.ClientMetadataFromContext (X-Client-Platform header) at the handler. Autopilot-originated issues stamp PlatformServer.
   - analytics.FeedbackSubmitted now takes a kind parameter; CreateFeedback reads req.Kind (default "general") so the picker selection lights up the metric's kind label instead of long-term "other".
   - analytics.ContactSalesSubmitted now takes a formSource (page / onboarding / agents_page); CreateContactSales reads req.Source. The metric reads ev.Properties["form_source"] so the analytics CoreProperties.Source ("marketing_contact_sales") stays backward-compat for PostHog dashboards.

3. analytics.OnboardingStarted helper added; server-side emission lives in PatchOnboarding, fired exactly once per user on the first PATCH that carries a non-empty questionnaire payload (firstTouch logic compares prior bytes against {} / null). Frontend onboarding_started keeps firing on page open; the server emission is what guarantees the Prometheus counter exists so Grafana can be cross-checked against the PostHog funnel without depending on the SDK roundtrip.

4. business_pairing_test.go tightened:
   - TestNoNakedAnalyticsCaptureInHandlersOrServices now allow-lists at function granularity (just captureTaskEvent in service/task.go), not whole-file. Any future naked Capture in the same file fails CI.
   - TestEveryAnalyticsRecordEventTakesAnalyticsHelper now does def-use tracking inside the enclosing FuncDecl: when RecordEvent's third arg is an ast.Ident, the test walks the function body for the assignment that defined it and confirms the RHS is an analytics.<Helper>(...) call. Bare local idents that didn't originate from analytics are now caught.

5. gofmt -w applied across the touched files; gofmt -l clean.

Tests: go test ./internal/metrics/... ./internal/analytics/... pass. Pre-existing TestClaimTask_/TestWebhook_MergedPR/TestDeleteIssueByIdentifier failures on origin/main are DB-environment-dependent and not regressions from this change.

Co-authored-by: multica-agent <github@multica.ai>

* fix(server): normalise onboarding_started platform label + regression test (MUL-2949)

Addresses 张大彪's last review nit:
- IncForEvent's EventOnboardingStarted case now wraps the platform property with NormalizePlatform, matching every other platform-bearing metric. A misbehaving frontend can no longer leak a raw X-Client-Platform header value into the multica_onboarding_started_total{platform=...} series.
- New labels_pr3_test.go covers every PR3 normalizer with both a happy-path value and an unknown value, asserting the unknown collapses to the documented fallback bucket. Includes a focused regression for onboarding_started: emits one event with an attacker-shaped platform string and asserts the metric only exposes web + unknown label values (no raw header bleed).
- testutil.go gains a small GatherForTest helper so the regression test can pull the typed MetricFamily map without re-implementing the registry-walk dance.

Co-authored-by: multica-agent <github@multica.ai>

* fix(server): NormalizeTaskSource on workspace_created + document lint limitations (MUL-2949)

Final review touch-ups before merge:
- IncForEvent's EventWorkspaceCreated case wraps source through NormalizeTaskSource, matching the other source-bearing dispatches (issue_created, agent_created, issue_executed). Closes the last raw property leak in the dispatcher table.
- business_pairing_test.go inline docstrings now spell out the two known limitations of the lint gate that 张大彪 / Eve flagged: analyticsBackedIdents matches by ident NAME (not SSA def-use, so a nested-scope shadow could pass) and isMetricsRecordEvent hard-codes the import alias set. PR description carries a Follow-ups section with the same two items so the work is visible after merge.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: 魏和尚 <agent+wei@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>	2026-06-03 16:39:06 +08:00
LinYushen	3943358e67	feat(billing): proxy /api/cloud-billing/* + Stripe webhook to multica-cloud (#3434)	2026-05-28 16:05:19 +08:00
LinYushen	c968c13c87	feat(auth): support mcn_ Cloud Node PATs verified via Fleet (#3349)

* feat(auth): support mcn_ Cloud Node PATs verified via Fleet

Adds a new token kind, mcn_ (multica cloud node), recognized in both the regular Auth and DaemonAuth middlewares. mcn_ tokens are minted and owned by Multica Cloud (not the local personal_access_tokens table); the server validates them by POSTing to the Fleet's /api/v1/pat/verify endpoint and uses the returned owner_id as X-User-ID for downstream handlers.

Cloud is the authoritative owner of token status, so this is a verifier-only path with no DB fallback:

  * Fleet says valid:false -> 401 (token genuinely bad)
  * Fleet unreachable / 5xx -> 503 (transient, retry)
  * No MULTICA_CLOUD_FLEET_URL configured -> 401 (fail closed)

Verification results are cached in Redis for 60s under mul:auth:mcn:<sha256> to bound the per-request load on Fleet without extending the revocation window beyond what the Cloud doc allows. Negative results are NOT cached, so a freshly minted token doesn't get locked out by a stale 'token_not_found'.

Reuses MULTICA_CLOUD_FLEET_URL (the same env the cloud-runtime proxy already uses) so deployments don't need a second config knob.

Tests cover the happy path, every documented invalid reason, 4xx/5xx mapping, network error, decode error, ctx cancellation, the fail-closed valid:true-without-owner_id case, trailing-slash URL normalization, and the Redis cache short-circuit + negative no-cache contract. Middleware tests pin the four 401/503/200 outcomes in both Auth and DaemonAuth.

* auth(mcn): require owner_id to map to a real local user; drop X-User-PAT plumbing

Two related changes:

1. Cloud-verified owner_id is now checked against our local users table. The Cloud owner_id and our users.id share the same UUID space by contract; a missing local user means either the row was deleted under an active node or something is forging owner_ids; either way, fail closed.

   CloudPATVerifier.Verify takes a new OwnerLookupFunc:
   - returns (true, nil) -> success, cache + return
   - returns (false, nil) -> ErrCloudPATInvalid (reason='owner_unknown'), NOT cached (so a freshly-created user doesn't get locked out for a TTL window)
   - returns (_, error) -> ErrCloudPATUnavailable (transient, middleware emits 503)

   Both Auth and DaemonAuth wire ownerLookupFor(queries), a new shared helper that wraps queries.GetUser, mapping pgx.ErrNoRows / unparseable UUIDs to (false, nil) and other errors to a real Go error.

2. Removed all X-User-PAT plumbing. Cloud now mints node-scoped mcn_ PATs itself during /api/v1/nodes (see multica-cloud docs/api/node-pat.md) and ships them into the EC2 instance via SSM, so multica-api no longer needs to forward the caller's mul_ PAT. Propagating a long-lived user PAT into a remote machine widened the blast radius of any node compromise; that's gone now.

   Removed:
   - cloud_runtime.go: withUserPAT option, cloudRuntimeUserPAT, generateCloudRuntimePAT, revokeGeneratedPAT
   - cloudruntime/Request.UserPAT field + X-User-PAT header
   - X-User-PAT from CORS allowed headers
   - obsolete handler tests:
     TestCreateCloudRuntimeNodeForwardsValidatedPAT
     TestCreateCloudRuntimeNodeRejectsUnownedPAT
     TestCreateCloudRuntimeNodeRejectsExpiredPAT
     TestCreateCloudRuntimeNodeAutoGeneratesPAT
     replaced with TestCreateCloudRuntimeNodeForwardsBody
   - X-User-PAT references in packages/core/api/client.test.ts

Tests:
  * 3 new verifier-level tests (owner_unknown not cached, lookup error -> Unavailable, success path is cached for both fleet AND lookup)
  * 5 new owner_lookup_test.go tests (nil queries, existing user, missing user, malformed UUID, DB error)
  * 1 new end-to-end DaemonAuth test (cloud says valid, no local user -> 401)
  * Existing X-User-PAT TS assertions removed; full vitest run passes.
  * go test ./... and go vet ./... clean on the server module.	2026-05-27 14:52:03 +08:00
Multica Eve	41cb91abd9	feat: add cloud runtime fleet proxy API (MUL-2453) (#2986)

* feat: add cloud runtime fleet proxy API

Co-authored-by: multica-agent <github@multica.ai>

* test: cover cloud runtime handler nits

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 15:06:10 +08:00
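
The cardinality protection described in commit de900b2ba6 (label values pass through fixed allow-lists, unknown values collapse to a fallback bucket) could look roughly like the sketch below. This is not the repository's labels_pr3.go: the function name normalizeLabel, the allow-list contents, and the trim/lowercase step are assumptions made for illustration. Only the 'unknown' fallback and the web/server platform values are taken from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedPlatforms is a hypothetical allow-list; the real set lives in the
// repository and may differ. "web" and "server" appear in the commit message,
// the other entries are illustrative.
var allowedPlatforms = map[string]struct{}{
	"web":     {},
	"desktop": {},
	"cli":     {},
	"server":  {},
}

// normalizeLabel returns raw if it is in the allow-list, otherwise fallback.
// Trimming and lowercasing before the lookup is an assumption of this sketch.
func normalizeLabel(raw string, allowed map[string]struct{}, fallback string) string {
	v := strings.ToLower(strings.TrimSpace(raw))
	if _, ok := allowed[v]; ok {
		return v
	}
	return fallback
}

func main() {
	// A known value passes through unchanged (after normalisation).
	fmt.Println(normalizeLabel("Web", allowedPlatforms, "unknown"))
	// An attacker-shaped header value never becomes a new label value.
	fmt.Println(normalizeLabel("x\"} evil_series{a=\"b", allowedPlatforms, "unknown"))
}
```

Because every label value is drawn from a fixed set plus one fallback, the number of time series per metric stays bounded no matter what clients send in headers or event properties.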

4 Commits