multica

mirror of https://github.com/multica-ai/multica.git synced 2026-07-05 13:29:44 +02:00

Author	SHA1	Message	Date
Multica Eve	da624a8835	feat: add configurable S3 path-style addressing (#4739) Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-07-02 15:01:45 +08:00
MeloMei	ff286dcfac	MUL-3848: fix(server): skip CLIENT SETNAME for managed Redis compatibility Closes #4627	2026-06-30 15:51:45 +08:00
Naiyuan Qing	b336f07617	Revert "feat(analytics): anonymous self-host onboarding source beacon (MUL-37…" (#4712) This reverts commit `63eb6f73ad`.	2026-06-29 19:01:14 +08:00
Naiyuan Qing	63eb6f73ad	feat(analytics): anonymous self-host onboarding source beacon (MUL-3708) (#4691 ) * feat(analytics): anonymous self-host onboarding source beacon (MUL-3708) Production self-host servers now report the anonymous onboarding "how did you hear about us" channel to Multica's public write-only ingest, so the self-host source distribution becomes visible alongside official cloud. Official cloud keeps its existing PostHog capture unchanged; this is a submit-time beacon, not a background telemetry pipeline. - server/internal/sourcebeacon: ShouldSend gate (production + non-local + non-.multica.ai app host, fail-closed — judged by the app/frontend host, not the backend URL, which official often leaves unset), per-instance salted hashing, deterministic event uuid, fire-and-forget sender. - POST /api/telemetry/self-host-source: public, write-only, per-IP rate-limited, 4 KiB body cap, channel allowlist, strict unknown-field rejection. Lands in PostHog as self_host_source_channel with a deterministic uuid (best-effort dedup), $process_person_profile=false, and deployment=self_host — a distinct event name so it never pollutes the official onboarding funnel. - Hook in PatchOnboarding fires once when the source is first set; never blocks onboarding. Only channel enum(s) + two per-instance hashes leave the box — never user_id/email/name/workspace/org/domain/role/use_case/the source_other free-text/IP. - migration 128: system_settings singleton holding instance_salt. - frontend: self-host-only anonymous-collection notice on the source step, gated by a new /api/config self_host_source_notice flag (en/zh-Hans/ko/ja). - analytics.Event gains an optional top-level uuid; docs/analytics.md, SELF_HOSTING.md and .env.example document exactly what is/isn't sent and how to disable it (ANALYTICS_DISABLED). Also fixes the long-standing team_size→source drift in docs/analytics.md. 
Verified locally: go build/vet, go test (sourcebeacon, analytics, handler), pnpm typecheck (all packages), locale parity (157), step-source (6) + core config/schema (69) vitest, lint (0 errors). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> fix(analytics): wire self-host source beacon through metrics, guard nil pool (MUL-3708) Addresses Howard CI blockers on #4691 (no product-direction change): - loadInstanceSalt returns "" on nil pool; salt is only loaded when ShouldSendFromEnv() is true, via a bounded (5s) context — restores the "router constructible without a DB" invariant (nil-pool routing tests). - Add multica_self_host_source_channel_total counter (by source) + an IncForEvent case, so every analytics event is paired with a Prometheus counter. NormalizeSourceChannel reuses sourcebeacon allowlist (no 3rd copy). - Beacon handler now builds the event via the analytics.SelfHostSourceChannel helper and ships it through obsmetrics.RecordEvent (no naked Capture); not IsMetricsOnly, so it still reaches PostHog. - Prime the new family in the registry-families test. Verified: go build/vet, go test ./internal/metrics ./internal/sourcebeacon ./internal/handler ./cmd/server (incl. the 3 named blockers + registry + record-event-helper lints) all green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-06-29 15:56:16 +08:00
Willow Lopez	af34b8f83a	feat(lark): add proxy support for WebSocket connections (#4166 ) Add ProxyURL field to GorillaDialer so deployments behind a corporate proxy can route Lark WebSocket connections through an HTTP CONNECT proxy. - GorillaDialer.ProxyURL: optional proxy URL parsed and applied to the underlying gorilla/websocket dialer before each DialContext call. Empty value preserves the default ProxyFromEnvironment behaviour. - Router reads MULTICA_LARK_WS_PROXY_URL env var and sets it on the production dialer. - Three new unit tests cover invalid URL, proxy-applied, and empty-URL default paths. Closes #4032 Co-authored-by: multica-agent <github@multica.ai>	2026-06-24 15:12:53 +08:00
Multica Eve	4a8210912a	feat(featureflag): framework-level feature flag system (MUL-3615) (#4496 ) * feat(featureflag): framework-level feature flag system (MUL-3615) Introduces a reusable feature flag framework so future features can adopt flags without writing infrastructure code. Backend: server/pkg/featureflag (Go) - Service / Provider / Decision separation per Martin Fowler's Toggle Point / Toggle Router / Toggle Configuration pattern. - Providers: StaticProvider (rules in source control), EnvProvider (FF_<KEY> overrides for ops kill switches), ChainProvider (first-hit-wins composition). - EvalContext carried through context.Context with WithEvalContext / EvalContextFrom; supports user_id, workspace_id, free-form attributes. - PercentRollout via deterministic FNV-1a bucketing; same user always lands in the same bucket so experiments do not flap between requests. - Nil-safe Service: a nil Service or missing flag returns the caller's default so business code never panics on a missing flag. - 100% unit-test coverage with -race; go vet clean. Frontend: packages/core/feature-flags (TypeScript) - Same vocabulary as the Go side (Decision, EvalContext, Rule, PercentRollout). FNV-1a parity ensures cross-tier bucket agreement. - FeatureFlagService + StaticProvider + ChainProvider in pure TS. - React glue: FeatureFlagsProvider, useFlag(key, default), useVariant(key, default). Hooks fall back to the default when no provider is mounted so Storybook / unit tests stay simple. - Vitest tests for service, providers, hash, and React hooks. Docs: docs/feature-flags.md — wiring, EvalContext, toggle points, backend-protection note, and the standard best-practice checklist. The framework intentionally has no third-party Go deps and no API surface beyond what real callers will need. New providers (DB, remote config, LaunchDarkly) plug in by implementing Provider; no existing caller has to change. 
Co-authored-by: multica-agent <github@multica.ai> fix(featureflag): cross-tier hash parity + variant only when enabled (MUL-3615) Two must-fix issues from the PR review on #4496: 1. TS hash had a trailing zero separator that Go did not emit, so the same (key, identifier) bucketed differently on the two tiers. The "user lands in the same bucket on server and client" promise was broken. For example billing_new_invoice/user-42 was bucket 97 in Go and bucket 11 in TS. Fix: TS fnv1a now emits the zero separator BETWEEN parts only, never after the last one, matching Go's hash.Write byte stream exactly. Verified by parallel golden tests on both sides that pin five (key, identifier) -> bucket triples; if either side drifts both tests fail and one must be brought back in sync. 2. StaticProvider returned `Rule.Variant` regardless of whether the rule evaluated to enabled=true. A 0%-rollout user, a deny-listed user, or a default-off user would see variant="experiment-v2", so callers branching on Variant() would route control users into the experiment arm. Fix: Rule.Variant is now the ON-variant only. When the rule evaluates to enabled=false the Decision's variant is the canonical "off", regardless of what Rule.Variant says. Documented as a behavior contract in the Rule godoc / JSDoc and covered by regression tests on both sides. Tests: - go test -race ./pkg/featureflag/... : all green (1.58s). - pnpm --filter @multica/core test : 661/661 (3 new). - pnpm --filter @multica/core typecheck: clean. Co-authored-by: multica-agent <github@multica.ai> * fix(featureflag): hash UTF-8 bytes on the TS side for cross-tier parity (MUL-3615) Follow-up review on PR #4496 caught that the previous hash fix was only correct for ASCII input. The TS side used `charCodeAt`, which returns UTF-16 code units, while the Go side hashes the UTF-8 byte representation. 
Any non-ASCII flag key or identifier — Chinese flag names, accented user IDs, emoji — would bucket differently on backend vs frontend, silently breaking the "same user, same bucket" promise the PR description makes. Concretely: flag/é Go 53 vs TS-old 68 flag/🦄 Go 82 vs TS-old 75 实验/user-1 Go 90 vs TS-old 4 flag/用户-1 Go 95 vs TS-old 2 Fix: replace per-char charCodeAt with a module-level `TextEncoder` ('utf-8') and hash each encoded byte. After the fix all four cases above match Go exactly, and the existing ASCII cases continue to match. The cross-language golden tables on both sides now include the 5 new non-ASCII cases alongside the 5 ASCII cases, so any future regression that swaps UTF-8 for charCodeAt (or vice versa) will fail loudly on both Go and TS simultaneously. TextEncoder is part of WHATWG Encoding and is available in every evergreen browser, in Node 11+, and in Hermes (React Native) >= 0.74, which covers every runtime that imports @multica/core/feature-flags. Tests: - go test -race ./pkg/featureflag/... : all green. - pnpm --filter @multica/core test : 661/661. - pnpm --filter @multica/core typecheck : clean. Co-authored-by: multica-agent <github@multica.ai> * feat(featureflag): wire into main app config — YAML file + env override (MUL-3615) Follow-up requested by Yushen on PR #4496: make the feature flag framework configurable through the existing main-program config system instead of requiring Go code edits. multica's main app is purely env-var driven (see .env.example) with optional MULTICA_*_FILE knobs for richer config; feature flags now follow the same pattern. server/pkg/featureflag/config.go - LoadRulesFromYAMLFile(path) parses a YAML rule set into runtime Rule structs. Empty files are a valid "no flags yet" state; missing or malformed files surface a hard error so operators see misconfig the same way DATABASE_URL parse errors do. - NewServiceFromEnv composes the standard provider chain: 1. EnvProvider("FF_") (runtime kill-switch path) 2. 
StaticProvider from YAML file (declarative rule set) When MULTICA_FEATURE_FLAGS_FILE is unset, only the env layer is active and every IsEnabled call falls through to the caller's default, so the server can boot before any flag is authored. server/cmd/server/main.go - Construct the Service once at startup right after env-var warnings, fail loudly on malformed YAML, log the loaded rule count via the Service logger. The Service is held in a local `flags` variable ready to be threaded into handler.Handler / service constructors when the first flag user lands. Threading is deferred to the PR that adds the first business consumer so this PR stays a pure framework + config layer. .env.example - New "Feature flags" section documents MULTICA_FEATURE_FLAGS_FILE and the FF_<KEY> override convention, with a minimal YAML schema example inline. docs/feature-flags.md - Replace the "build a provider manually" example with the NewServiceFromEnv pattern that now matches what main.go actually does. Show the YAML schema in one place. Note the on-variant / off semantics from the previous review round. server/pkg/featureflag/doc.go - Update package doc to mention the gopkg.in/yaml.v3 dependency (already a server-level dep) instead of the now-inaccurate "no third-party dependencies" claim. Tests: - go test -race -count=1 ./pkg/featureflag/... all green; new config_test.go covers: simple YAML, full-shape YAML, empty file, missing file, malformed YAML, no env var, file-only, env-beats-file, bad file surfaces error. - go test -race -count=1 -run TestHealth ./cmd/server/... sanity check that the main.go boot path with the new wiring still passes. - go vet ./... clean. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-06-24 13:49:59 +08:00
Bohan Jiang	4df6c1468d	fix: validate selfhost compose env defaults (#4138) Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-15 15:43:10 +08:00
Bohan Jiang	6ac8314711	feat(lark): support both Feishu and Lark from one deployment (MUL-3083) (#3815 ) * feat(lark): serve Feishu and Lark from one deployment, per installation The Lark integration was locked to a single open-platform host chosen deployment-wide (MULTICA_LARK_HTTP_BASE_URL / _CALLBACK_BASE_URL, defaulting to open.feishu.cn), so one deployment could talk to only the mainland Feishu cloud OR Lark international — never both. Teams on the other tenant could not use the integration at all. Make the host per-installation. The device-flow installer already auto-detects the tenant (Lark emits tenant_brand="lark" mid-poll); we now persist that as lark_installation.region, carry it on InstallationCredentials.Region, and resolve the open-platform host per call (REST + WS bootstrap) from the region. An explicit cfg.BaseURL (env / httptest) still overrides every region, so existing tests and staging/proxy setups keep working. - migration 116: lark_installation.region TEXT NOT NULL DEFAULT 'feishu' CHECK (region IN ('feishu','lark')) — existing rows are all mainland. - lark.Region enum + OpenPlatformBaseURL/RegionOrDefault helpers. - registration: thread the detected region into finishSuccess so the install-time GetBotInfo hits the right cloud AND the row records it. - every credential-build site (patcher, replier, WS provider, union_id backfill) copies region off the installation row. - region is part of the WS supervisor fingerprint so a re-install that switches cloud restarts the connection. - API: surface region on the installation listing DTO. MUL-3083 Co-authored-by: multica-agent <github@multica.ai> * feat(lark): surface installation region in settings UI Read the per-installation region off the listings response: build the "Manage in Lark" dev-console host from it (open.feishu.cn vs open.larksuite.com instead of a hardcoded mainland host) and render a Feishu / Lark badge on each connected bot. 
The field is optional and defaults to Feishu when an older server omits it (API-compat). Adds the region_feishu / region_lark labels to all four locales. MUL-3083 Co-authored-by: multica-agent <github@multica.ai> * docs(lark): document simultaneous Feishu + Lark support The cloud each bot belongs to is now auto-detected at install and stored per installation, so one deployment serves both. Replace the old "point MULTICA_LARK_HTTP_BASE_URL at larksuite for international tenants" guidance (now just an optional override) in all four locales. MUL-3083 Co-authored-by: multica-agent <github@multica.ai> * fix(lark): repair legacy Lark-international installs on upgrade Review follow-up (MUL-3083). Migration 116 backfilled every existing lark_installation to region='feishu', assuming all historical rows were mainland. But self-host deployments could already run Lark international via the deployment-wide MULTICA_LARK_HTTP_BASE_URL override, so those rows are really Lark — clearing the override after upgrade (which the new docs invite) would route them to open.feishu.cn and break them. Add a one-shot startup repair, BackfillRegionFromLegacyOverride, fired off the hot path like BackfillBotUnionIDs: when the deployment's global base-URL override targets open.larksuite.com, relabel the still-default 'feishu' rows to 'lark'. Gating on the deployment-wide override is what makes it safe — every pre-existing install on such a deployment was Lark. Idempotent; no-op on mainland / fresh deployments. Verified end-to-end against a scratch DB (flip then 0-row idempotent re-run). Also document that a Lark/飞书 app_id is globally unique across both clouds, which is what makes the app_id-keyed token cache and the UNIQUE(app_id) constraint safe across regions (review nit). MUL-3083 Co-authored-by: multica-agent <github@multica.ai> * docs(lark): fix ops guidance to match auto per-installation region Review follow-up (MUL-3083). 
.env.example and docker-compose.selfhost.yml still told operators that international Lark requires pointing both base URLs at open.larksuite.com — now wrong, and it would push a fresh deployment back into a single-cloud override. Rewrite them: the base URLs are optional deployment-wide overrides; normal dual-cloud operation keeps them empty. Document the first-boot auto-relabel for deployments migrating off the old single-cloud override, across the integration docs (en/zh/ja/ko). MUL-3083 Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-05 16:03:13 +08:00
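The per-installation host resolution described in commit 6ac8314711 can be sketched as follows. The two hosts and the rule that an explicit base-URL override wins over every region come from the commit message. The lowercase function names, their signatures, and the `https://` scheme are assumptions and do not claim to match the real `OpenPlatformBaseURL` / `RegionOrDefault` helpers.

```go
package main

import "fmt"

// Region identifies which open-platform cloud an installation belongs to.
type Region string

const (
	RegionFeishu Region = "feishu"
	RegionLark   Region = "lark"
)

// regionOrDefault treats anything other than "lark" as mainland Feishu,
// mirroring the migration default of 'feishu' for older rows.
func regionOrDefault(r Region) Region {
	if r == RegionLark {
		return RegionLark
	}
	return RegionFeishu
}

// openPlatformBaseURL resolves the host for one call. A non-empty override
// (for example a deployment-wide env value) beats the stored region.
func openPlatformBaseURL(r Region, override string) string {
	if override != "" {
		return override
	}
	if regionOrDefault(r) == RegionLark {
		return "https://open.larksuite.com"
	}
	return "https://open.feishu.cn"
}

func main() {
	fmt.Println(openPlatformBaseURL(RegionLark, ""))
	fmt.Println(openPlatformBaseURL("", ""))
	fmt.Println(openPlatformBaseURL(RegionLark, "http://127.0.0.1:8080"))
}
```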
Multica Eve	905ebbdde1	fix(github): populate connected account name on install [MUL-3078] (#3811 ) * fix(github): populate connected account name on install [MUL-3078] The Settings → GitHub connection card was rendering 'Connected to unknown' because: 1. fetchInstallationAccount in the setup callback hit GitHub's /app/installations/{id} endpoint unauthenticated. That endpoint requires App JWT auth; the call returned 401, and the function fell through to the 'unknown' placeholder which was persisted as account_login. 2. The installation webhook handler did upsert the row with the real login when GitHub later delivered installation.created, but it never published a github_installation:created event. The frontend query stayed stale, so the UI kept showing 'unknown' even after the row had been refreshed. Fix: - Add optional GITHUB_APP_ID + GITHUB_APP_PRIVATE_KEY env vars. When set, signGitHubAppJWT mints a short-lived RS256 JWT (back-dated 60s for clock skew, capped at 9m to stay inside GitHub's 10m max) and fetchInstallationAccount uses it as a Bearer token. The setup callback now writes the real org/user name on install. - When the new env vars are not configured, the call still falls through to 'unknown' as before — but the webhook handler now publishes EventGitHubInstallationCreated after the upsert, so the realtime listener invalidates the installations query and the UI converges to the real value within seconds, no manual refresh. Tests cover JWT signing (claims, signature, malformed PEM, partial config), fetchInstallationAccount with a JWT-gated httptest mock, and the webhook refresh + broadcast on a seeded 'unknown' row. Docs updated for .env.example and github-integration / environment-variables in en, zh, ja, ko. 
Co-authored-by: multica-agent <github@multica.ai> * test(github): defuse JWT clock-bomb by injecting parser time [MUL-3078] PR review caught that TestSignGitHubAppJWT_ClaimsAndSignature signed the token with a fixed 'now' (2026-06-05T12:00:00Z) but parsed it with a default jwt.Parse, which uses real time.Now() for exp validation. Once real wall clock crossed the token's exp (now + 9m = 12:09:00Z), the test would have flipped to a deterministic failure on every CI run. Inject the same fixed 'now' into the parser via jwt.WithTimeFunc so both signing and validation share one clock. Verified independently that without the fix the parser rejects the token as 'expired', and with the fix it accepts. Also clarified the fetchInstallationAccount comment to be unambiguous about what 'do not block' actually means: the HTTP call IS synchronous (no independent timeout, pre-existing wart), but a failure here just falls back to the unknown placeholder rather than aborting the callback. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-06-05 15:21:22 +08:00
Alex	4da43b383f	fix: selfhost env does not accept LARK related env (MUL-3060) (#3771 ) * fix: selfhost docker compose env does not accept LARK related env * fix(selfhost): pass through MULTICA_LARK_CALLBACK_BASE_URL for international Lark The inbound long-conn callback bootstrap reads MULTICA_LARK_CALLBACK_BASE_URL (server/cmd/server/router.go buildLarkConnectorFactory -> HTTPConnectionTokenFetcher), which defaults to open.feishu.cn with no fallback to MULTICA_LARK_HTTP_BASE_URL. Without it forwarded into the backend container, international Lark tenants can send (outbound HTTP via MULTICA_LARK_HTTP_BASE_URL) but never receive messages — the bootstrap still hits the mainland host. Forward the var in docker-compose.selfhost.yml and document all three Lark knobs in .env.example so operators can discover them from the standard 'cp .env.example .env' onboarding path. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-04 19:49:50 +08:00
Multica Eve	ae27058b0a	fix(attachments): unified download endpoint with mode + presign + proxy (MUL-2976) (#3747 ) Fix attachment download for self-hosted deployments using private S3-compatible buckets without CloudFront. Closes #3721. Server - New unified `GET /api/attachments/{id}/download` endpoint that picks CloudFront / S3 presign / server proxy at request time. - `ATTACHMENT_DOWNLOAD_MODE=auto\|cloudfront\|presign\|proxy` and `ATTACHMENT_DOWNLOAD_URL_TTL` env knobs; `auto` routes Docker hostnames / localhost / private IPs through the proxy and public S3 endpoints through presign. - `Storage.PresignGet` capability; S3 implementation generates presigned GET URLs. - `attachmentToResponse` returns the unified relative endpoint instead of leaking raw unsigned S3 URLs when CloudFront is not configured. Proxy path streams via `io.Copy` with `Content-Disposition` / `Content-Length` / `Cache-Control: no-store` / `X-Content-Type-Options: nosniff`. Clients - CLI / Desktop / Mobile resolve relative `download_url` values against the configured API base. Desktop covers the Electron native download bridge and the media preview modal; Mobile covers `Linking.openURL`, the markdown image RN loader, and the composer's completed non-image file chip. - Mobile gains a minimal Node-environment vitest lane wired into `mobile-verify.yml`. Docs - `.env.example`, `docker-compose.selfhost.yml`, `SELF_HOSTING_ADVANCED.md`, and the `environment-variables` doc set updated with the new env keys and the `ATTACHMENT_DOWNLOAD_MODE=proxy` recommendation for Docker / VPC-internal object stores. Tests - `internal/storage`, `internal/cli`, `internal/handler` (download endpoint, mode selection, proxy header, `/content` non-regression), `cmd/server` (trusted proxy parser). - `packages/views/editor/use-download-attachment.test.tsx` and `attachment-preview-modal.test.tsx` exercise relative URL resolution + absolute pass-through. 
- `apps/mobile/lib/attachment-url.test.ts` covers every helper branch plus the composer non-image chip case.	2026-06-04 14:52:57 +08:00
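The `auto` routing rule in commit ae27058b0a (Docker hostnames, localhost, and private IPs go through the proxy; public S3 endpoints use presign) can be sketched as a host classifier. The function name `autoMode`, the treatment of any dot-less hostname as a Docker service name, and the fallback to `proxy` for unparseable endpoints are assumptions; the CloudFront branch is left out.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strings"
)

// autoMode picks "proxy" or "presign" from the S3 endpoint URL.
// The endpoint is expected to include a scheme, e.g. "http://minio:9000".
func autoMode(endpoint string) string {
	u, err := url.Parse(endpoint)
	if err != nil || u.Hostname() == "" {
		return "proxy" // assumption: fail toward the always-reachable path
	}
	host := u.Hostname()
	if ip := net.ParseIP(host); ip != nil {
		// Loopback and RFC 1918 / unique-local addresses are not
		// reachable from a user's browser.
		if ip.IsLoopback() || ip.IsPrivate() {
			return "proxy"
		}
		return "presign"
	}
	// Assumption: a single-label name such as "minio" is a Docker
	// service hostname that only resolves inside the compose network.
	if host == "localhost" || !strings.Contains(host, ".") {
		return "proxy"
	}
	return "presign"
}

func main() {
	for _, e := range []string{
		"http://minio:9000",
		"http://10.0.0.5:9000",
		"https://s3.amazonaws.com",
	} {
		fmt.Println(e, "->", autoMode(e))
	}
}
```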
Bohan Jiang	8db619c1cd	fix(email): wire SMTP_EHLO_NAME through self-host config + docs [MUL-2984] (#3749 ) * fix(email): wire SMTP_EHLO_NAME through self-host config + docs Follow-up to #3679, which added SMTP_EHLO_NAME in code but never exposed it to operators. - docker-compose.selfhost.yml: pass SMTP_EHLO_NAME through to the backend container. The compose env block is an explicit allowlist, so without this the override set in .env was silently dropped and never reached the process — making the escape hatch unusable on the docker path. - Document the var alongside its SMTP_* siblings: .env.example, SELF_HOSTING_ADVANCED.md, environment-variables.mdx, auth-setup.mdx, and self-host-quickstart.mdx (the last two with a strict-relay example). - email.go: log when os.Hostname() fails instead of silently falling back to net/smtp's lazy "localhost" — the exact greeting strict relays reject. - Add TestNewEmailService_EHLOName covering the env override, trimming, and the hostname fallback. MUL-2984 Co-authored-by: multica-agent <github@multica.ai> * fix(email): gate EHLO resolution to SMTP mode + sync docs to zh/ja/ko Addresses review nits on this PR: - email.go: resolve smtpEHLOName only when SMTP_HOST is set, so the Resend / DEV-stdout paths never call os.Hostname() or emit its failure log. The EHLO name is only ever used on the SMTP send path. - docs: add SMTP_EHLO_NAME to the zh/ja/ko variants of environment-variables, self-host-quickstart, and auth-setup, in sync with the English docs updated earlier in this PR. Note: the ja/ko self-host-quickstart and auth-setup pages were already missing the port-465 implicit-TLS example (pre-existing i18n drift from an earlier SMTP_TLS change, unrelated to this PR); the new EHLO block is inserted at the correct logical anchor regardless. A full ja/ko re-sync is left as a separate follow-up. 
MUL-2984 Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-04 14:44:55 +08:00
fengchangguo-star	2cf8107fc8	feat(email): support implicit TLS (SMTPS/465) for SMTP relay (MUL-2768) (#3340 ) * feat(email): support implicit TLS (SMTPS/465) for SMTP relay The SMTP relay previously only did opportunistic STARTTLS: it dialed plaintext and upgraded if the server advertised STARTTLS. Providers that only offer implicit TLS on port 465 and do not advertise STARTTLS (e.g. Aliyun enterprise mail) could not be used as a relay at all. Add an SMTP_TLS env var: - unset / starttls (default): unchanged STARTTLS-upgrade behavior. - implicit / smtps / ssl: dial with tls.DialWithDialer (SMTPS). Implicit TLS is auto-enabled when SMTP_PORT=465 and SMTP_TLS is unset, so the common case works with no extra config. The startup log line now reports the negotiated mode (starttls / implicit-tls). Co-authored-by: multica-agent <github@multica.ai> * feat(email): plumb SMTP_TLS through selfhost compose, warn on unknown values The backend reads SMTP_TLS but docker-compose.selfhost.yml never forwarded it, so SMTP_TLS=implicit on a non-standard port (or an explicit starttls override on 465) silently did nothing inside the container. Add it to the backend.environment block. Also log a one-line warning when SMTP_TLS is set to an unrecognized value (e.g. "tls"/"true"/"on"), which would otherwise fall through to STARTTLS and fail to dial a 465 SMTPS port with no startup hint. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * test(email): cover SMTP_TLS precedence and alias resolution Table-driven test over NewEmailService asserting the implicit-TLS decision: 465 auto-enables implicit; explicit starttls on 465 overrides auto-detect; implicit/smtps/ssl aliases (case-insensitive, whitespace-trimmed) force SMTPS on any port; unknown values fall back to starttls. 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * docs: document SMTPS / SMTP_TLS support, drop "465 unsupported" Port 465 implicit TLS is now supported, so the five places that said it was unsupported are wrong. Replace those sentences, add an SMTP_TLS row to the environment-variables tables (EN + ZH), and add a copy-pasteable SMTPS env block to the auth-setup pages. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: guofengchang <guofengchang@cumulon.com> Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 18:15:04 +08:00
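The `SMTP_TLS` precedence described in commit 2cf8107fc8 can be summarised in a small sketch: port 465 auto-enables implicit TLS only when the variable is unset, an explicit `starttls` overrides that, the `implicit` / `smtps` / `ssl` aliases are case-insensitive and trimmed, and unknown values fall back to STARTTLS. The function name and the second return value (used here to signal that a warning should be logged) are assumptions, as is treating a whitespace-only value as unset.

```go
package main

import (
	"fmt"
	"strings"
)

// useImplicitTLS decides between implicit TLS (SMTPS) and STARTTLS.
// recognized is false when the value is unknown, so the caller can warn.
func useImplicitTLS(smtpTLS string, port int) (implicit, recognized bool) {
	switch strings.ToLower(strings.TrimSpace(smtpTLS)) {
	case "":
		// Unset: auto-detect from the conventional SMTPS port.
		return port == 465, true
	case "starttls":
		// Explicit choice wins even on port 465.
		return false, true
	case "implicit", "smtps", "ssl":
		return true, true
	default:
		// Unknown values such as "tls" or "true" fall back to STARTTLS.
		return false, false
	}
}

func main() {
	fmt.Println(useImplicitTLS("", 465))
	fmt.Println(useImplicitTLS("starttls", 465))
	fmt.Println(useImplicitTLS(" SMTPS ", 2525))
	fmt.Println(useImplicitTLS("tls", 465))
}
```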
Ivan Vinokurov	9aa8ba0191	fix(runtimes): self-host daemon setup URLs (MUL-2804) (#3474) Expose self-host daemon setup URLs from /api/config at runtime so the Add computer dialog renders the operator's own server/app domains, while Multica Cloud defaults stay unchanged. Fixes #3013.	2026-05-30 18:13:02 +08:00
Bohan Jiang	90ddfb04e2	feat(self-host): DISABLE_WORKSPACE_CREATION env var (MUL-2777) (#3441 ) * feat(self-host): DISABLE_WORKSPACE_CREATION env var (MUL-2777, #3433) When self-hosters set DISABLE_WORKSPACE_CREATION=true, POST /api/workspaces returns 403 for every caller and the UI hides every "Create workspace" affordance (sidebar, modal, /workspaces/new page, onboarding Step 2). This closes the gap where ALLOW_SIGNUP=false still let any signed-in user open an isolated workspace the platform admin couldn't see. - server: new Config.DisableWorkspaceCreation, gate in CreateWorkspace, workspace_creation_disabled in /api/config, Go tests. - frontend: new workspaceCreationDisabled in configStore, hide sidebar entry, swap NewWorkspacePage / CreateWorkspaceModal / onboarding StepWorkspace to a "creation disabled, ask for invite" state when the flag is on, EN + zh-Hans locale strings. - ops: .env.example, docker-compose.selfhost, helm values + configmap, SELF_HOSTING.md, SELF_HOSTING_ADVANCED.md, environment-variables docs (EN + zh). Co-authored-by: multica-agent <github@multica.ai> * fix(onboarding): drive create path off workspaceCreationAllowed (#3433) PR #3441 review: when DISABLE_WORKSPACE_CREATION=true and the user already has a workspace, StepWorkspace still walked the resume copy (`headline_resume` / `lede_resume` mentioning "or start another") and `creatingActive` ignored the flag, leaving a stale clickable create CTA possible if /api/config arrived late. Refactor StepWorkspace to derive a single `workspaceCreationAllowed` boolean from the config store. It now drives: - Initial `mode` state (defaults to "existing" when disabled + reusing so the CTA is pre-armed for the only valid action). - `creatingActive` so the footer CTA cannot fall back into the create branch even mid-render. - Eyebrow / headline / lede strings — adds `creation_disabled_{eyebrow,headline,lede}_resume` (EN + zh-Hans) for the disabled + reusing variant. 
Tests: cover the three reachable shapes — flag off + no existing, flag on + no existing, flag on + existing. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-28 16:42:08 +08:00
YOMXXX	bfb7c85491	fix(selfhost): derive local port URLs from env (MUL-2506) (#2939) * fix(selfhost): derive local port URLs from env * fix(selfhost): derive local script URLs	2026-05-24 13:05:53 +08:00
Jiayuan Zhang	cd37b4e3d6	feat(settings): consolidate GitHub options under a dedicated Settings tab (MUL-2414)	2026-05-19 17:23:30 +02:00
Kagura	59617f376e	feat(auth): make auth token TTL configurable via AUTH_TOKEN_TTL env var (MUL-2371) (#2713) * feat(auth): make auth token TTL configurable via AUTH_TOKEN_TTL env var Add AUTH_TOKEN_TTL environment variable (in seconds) to override the hardcoded 30-day auth token lifetime. Self-hosted deployments on trusted networks can set a longer value to avoid frequent magic-link re-authentication. The value is read once at startup and cached. Invalid or missing values fall back to the 30-day default with a warning log. Closes #2685 * refactor(auth): extract parseAuthTokenTTL for testability Address review feedback: extract pure parse function from sync.Once wrapper so the parsing logic can be unit-tested independently. Add TestParseAuthTokenTTL with table-driven cases. Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com> * refactor(auth): accept Go duration strings + hoist shared TTL in SetAuthCookies Address nice-to-have review feedback from Bohan-J: - parseAuthTokenTTL now tries time.ParseDuration first (e.g. '8760h'), falling back to ParseInt for integer seconds - Warn on unreasonable values (>10 years) but still accept them - Hoist AuthTokenTTL() and time.Now() in SetAuthCookies so both cookies share the exact same expiry - Add security trade-off note in .env.example - Add 5 new test cases for duration strings Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com> Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com> * fix: use AuthTokenTTL() in CloudFront middleware, guard ParseInt overflow Address review feedback from Bohan-J (round 2): 1. CloudFront refresh middleware (cloudfront.go:21) was hardcoding 30*24*time.Hour instead of using auth.AuthTokenTTL(). Now calls AuthTokenTTL() so the middleware respects AUTH_TOKEN_TTL env var. 2. parseAuthTokenTTL integer-seconds branch: very large values like 9999999999 would silently overflow int64 when multiplied by time.Second.
Added overflow guard comparing against math.MaxInt64/int64(time.Second) before the multiplication. 3. Updated AuthTokenTTL() doc comment to reflect that it accepts Go duration strings or integer seconds (not just seconds). 4. Added middleware test (cloudfront_test.go) verifying short AUTH_TOKEN_TTL produces short cookie expiry, not 30-day hardcode. Also covers nil signer and existing-cookie-skip cases. 5. Added integer overflow test case to cookie_test.go. * style: run gofmt on cookie.go and cookie_test.go --------- Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com> Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>	2026-05-19 16:22:07 +08:00
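The parsing order described in the AUTH_TOKEN_TTL commit above (Go duration string first, then integer seconds, with an overflow guard before the multiplication) can be sketched roughly as follows. This is an illustrative reconstruction, not the repository's code: the `(duration, accepted)` return shape and the rejection of non-positive values are assumptions, and the warning logs mentioned in the commit are omitted.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
	"time"
)

// defaultAuthTokenTTL mirrors the 30-day default named in the commit message.
const defaultAuthTokenTTL = 30 * 24 * time.Hour

// parseAuthTokenTTL is a hypothetical sketch of the parser the commit describes.
// It returns the parsed TTL and true, or the default and false when the input
// is empty, invalid, non-positive (assumption), or would overflow.
func parseAuthTokenTTL(raw string) (time.Duration, bool) {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return defaultAuthTokenTTL, false
	}
	// First try a Go duration string such as "8760h".
	if d, err := time.ParseDuration(raw); err == nil {
		if d <= 0 {
			return defaultAuthTokenTTL, false
		}
		return d, true
	}
	// Fall back to integer seconds such as "3600".
	secs, err := strconv.ParseInt(raw, 10, 64)
	if err != nil || secs <= 0 {
		return defaultAuthTokenTTL, false
	}
	// Guard the multiplication: secs * time.Second must fit in int64.
	if secs > math.MaxInt64/int64(time.Second) {
		return defaultAuthTokenTTL, false
	}
	return time.Duration(secs) * time.Second, true
}

func main() {
	for _, raw := range []string{"8760h", "3600", "9999999999", "soon", ""} {
		ttl, ok := parseAuthTokenTTL(raw)
		fmt.Printf("%q -> %v (accepted=%v)\n", raw, ttl, ok)
	}
}
```

Keeping the parser pure, as the refactor commit notes, is what makes table-driven tests possible without touching the sync.Once cache.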
Bohan Jiang	eb5c6d7547	docs(self-host): document auth rate-limit env keys (#2773 ) Adds REDIS_URL, RATE_LIMIT_AUTH, RATE_LIMIT_AUTH_VERIFY, and RATE_LIMIT_TRUSTED_PROXIES to the environment-variables page (EN + ZH) and to .env.example, with the reverse-proxy caveat that without RATE_LIMIT_TRUSTED_PROXIES every user shares the proxy IP and the whole deployment ends up in one bucket. Follow-up to #2636. MUL-2251. Co-authored-by: multica-agent <github@multica.ai>	2026-05-18 13:11:17 +08:00
johnhu-1237	79dd066363	fix env example websocket origin (#2599 )	2026-05-18 12:38:52 +08:00
Kerim Incedayi	9418d2a2c1	feat(autopilots): webhook triggers (server + CLI + UI + docs) MUL-2049 (#2348 ) * feat(server): add webhook trigger DB migration + sqlc queries Lays the foundation for webhook autopilot triggers: - partial unique index on autopilot_trigger.webhook_token (kind=webhook only) so the public ingress route can resolve a trigger in O(1) - GetWebhookTriggerByToken / TouchAutopilotTriggerFiredAt / RotateAutopilotTriggerWebhookToken / SetAutopilotTriggerWebhookToken queries, regenerated with sqlc * feat(server): webhook token generator + payload normalizer Two pure helpers for the webhook autopilot work: - generateWebhookToken: 32 random bytes -> base64-url, "awt_" prefix. 256 bits of entropy keeps brute-force off the table; the prefix makes leaked tokens recognisable in logs. - normalizeWebhookPayload: turns arbitrary JSON into the WebhookEnvelope shape (event/eventPayload/request) used by trigger_payload. Header- and body-based event inference covers GitHub, GitLab, X-Event-Type, and caller-provided envelopes; scalar/empty/invalid bodies are rejected so the handler can answer 400. * feat(server): generate webhook tokens and expose rotate endpoint - New handler.Config.PublicURL fed by MULTICA_PUBLIC_URL env so /api/autopilots/.../triggers responses can include an absolute webhook_url alongside the always-present webhook_path. - CreateAutopilotTrigger now mints a webhook_token via crypto/rand for kind=webhook and ignores cron/timezone for non-schedule kinds. api triggers stay accepted-but-inert per PLAN.md. - New POST /api/autopilots/{id}/triggers/{triggerId}/rotate-webhook-token protected by the existing workspace auth group; old tokens stop working immediately because the unique-index lookup keys on the current row value. * feat(server): public webhook ingress route + per-token rate limiter - New POST /api/webhooks/autopilots/{token} route, mounted outside the authenticated group: the path token is the credential. 
Workspace context is derived from the joined autopilot row, never headers. - Body capped at 256 KiB via http.MaxBytesReader; oversized payloads return 413 mid-read instead of being fully buffered. - Disabled triggers / paused / archived autopilots return 200 {"status":"ignored"} so providers stop retrying. - Skipped-runtime dispatches surface 200 {"status":"skipped"} with the reason from the autopilot service's pre-flight admission check. - WebhookRateLimiter interface with sliding-window in-memory + Redis Lua-script implementations. Default 60 req/min per token. Test coverage on the in-memory path; Redis variant fails open on cache errors so a Redis hiccup never blocks ingress. - Integration tests exercise token generation, dispatch, payload envelope persistence, GitHub-header inference, paused/disabled short-circuits, oversized rejection, and rotate-then-old-token-404. * feat(server): include webhook payload in create_issue description When an autopilot run is triggered by a webhook and execution_mode is create_issue, the agent only sees the issue body — never the run's trigger_payload. Append a 'Webhook event:' line and a fenced JSON block with the normalized eventPayload so the agent has the inbound context inline. Schedule / manual runs are unchanged. Tests cover: - schedule path keeps existing italic note, no webhook block - webhook path emits event line + payload block, italic before block - non-envelope JSON falls back to raw body (defensive) - non-webhook source with payload still gets no webhook block * feat(core): types, API client and mutations for webhook triggers - AutopilotRunStatus gains 'skipped' so the run-list UI handles the admission-skipped state explicitly instead of falling through to a generic case (the backend already emits it via MUL-1899). - AutopilotTrigger picks up optional webhook_path / webhook_url. Both are optional so older self-hosted servers that pre-date this change still parse cleanly. 
- buildAutopilotWebhookUrl helper composes a usable absolute URL with the priority webhook_url > apiBaseUrl + path > origin + path > path. Tested with seven cases covering each branch. - ApiClient.rotateAutopilotTriggerWebhookToken posts to /api/autopilots/{id}/triggers/{triggerId}/rotate-webhook-token; the HTTP-contract test pins URL + method. - useRotateAutopilotTriggerWebhookToken mutation invalidates autopilotKeys.detail on settle, mirroring the existing trigger-mutation pattern. * feat(views): webhook trigger UI in Add Trigger dialog and trigger row Add Trigger dialog gains a Schedule/Webhook segmented toggle: - Schedule reuses TriggerConfigSection unchanged. - Webhook hides the cron config and shows a help line; the trigger is created with kind=webhook and the URL is generated server-side. - Toast text differentiates schedule vs webhook on success. TriggerRow grows a webhook branch: - Webhook icon, kind translated via trigger_kind. - URL shown in a truncating monospace pill, with copy + rotate buttons. Copy uses navigator.clipboard with toast feedback; rotate uses an AlertDialog confirm because the old URL stops working immediately. - api triggers render a Deprecated badge and skip URL/copy/rotate affordances. RunRow gains a 'skipped' RUN_VISUAL entry (muted dash) so admission- skipped runs don't fall through to a generic case. Source label uses the new run_source i18n key instead of capitalize. Locales: en + zh-Hans gain run_status.skipped, run_source., trigger_kind., trigger_row.{copy_url,rotate_url,_confirm_,toast_}, add_trigger_dialog.{type_,webhook_help,toast_added_{schedule,webhook}}. * feat(cli): support webhook trigger creation and URL rotation - multica autopilot trigger-add now takes --kind schedule\|webhook (default schedule for backward compatibility). For webhook it skips --cron / --timezone validation and prints the resulting webhook URL, preferring the server-provided webhook_url and falling back to client.BaseURL + webhook_path. 
- New multica autopilot trigger-rotate-url <autopilot-id> <trigger-id> command for rotating the bearer URL of a webhook trigger. * docs(autopilots): add webhook trigger guide (en + zh) Replaces the 'Webhook and API triggers are not available yet' section with end-to-end webhook documentation: how the URL is generated, what payload shapes are accepted, the inferred-event rules, the bearer-secret warning + rotate flow, status-code semantics for accepted/skipped/ ignored/4xx/5xx outcomes, and the MULTICA_PUBLIC_URL self-host configuration. Run history list now mentions skipped status. The 'unavailable features' section narrows to api-kind triggers, HMAC signing, IP allowlists, and provider presets. * feat(views): add Schedule/Webhook toggle to the create autopilot dialog Closes the gap where a brand-new autopilot could only be created with a schedule trigger. The right-column config now has a Trigger section with a segmented Schedule/Webhook control: - Schedule keeps the existing cron/timezone UI. - Webhook hides the cron UI and shows a help line; on submit, a kind=webhook trigger is created right after the autopilot. In edit mode the toggle is intentionally hidden (PLAN.md treats trigger- type changes as delete-old + create-new, not in-place updates), but the panel still picks the right kind based on props.triggers[0].kind so a webhook autopilot doesn't render an irrelevant cron form. Locales: section_trigger_kind, trigger_kind_{schedule,webhook}, section_webhook, webhook_help_{create,edit} added in en + zh-Hans. * feat(views): show webhook URL inline after creating a webhook autopilot After a successful create with kind=webhook, the dialog stays open and swaps to a confirmation panel showing the freshly minted URL with a copy button + 'Treat this URL like a password' warning + Done button. Avoids the friction of "create the autopilot, then go find it in the list, click in, scroll to triggers, copy URL." 
Locales: dialog.webhook_created_{title,description,warning,done} added in en + zh-Hans. Schedule create flow is unchanged (toast + close). The success panel is gated on the trigger returned from the create mutation, so a partial failure (autopilot created, trigger creation errored) still falls through to the toast_create_partial path. * feat(views): show webhook payload in run detail dialog The agent transcript dialog now accepts an optional headerSlot that sits above the event list. The autopilot RunRow drops a WebhookPayloadPreview into that slot when the run came from a webhook and trigger_payload is non-empty. The preview is collapsed by default (the transcript itself is the main event), shows the inferred event name + receivedAt in the header, and reveals the eventPayload as pretty-printed JSON with a copy button on expand. Falls back gracefully if the row's trigger_payload doesn't match the WebhookEnvelope shape — the whole value is shown instead so nothing is hidden. Closes the "agent didn't echo the payload, now I can't see what triggered the run" gap. PLAN.md tracked this as "Payload preview in run history" under follow-ups. Locales: webhook_payload.{label, unknown_event, payload, content_type, copy, copied, copied_short, copy_failed} added in en + zh-Hans. * chore(server): wire MULTICA_PUBLIC_URL through self-host compose Two small follow-ups split out of the webhook trigger PR: - docker-compose.selfhost.yml passes MULTICA_PUBLIC_URL into the backend container so a self-hosted deployment behind a real domain gets absolute webhook URLs in the trigger response. Documented in .env.example with the rationale for not deriving the public host from request headers. - Drop a duplicated 'invalid json:' prefix in the webhook ingress 400 error path. normalizeWebhookPayload already prefixes its errors, so the handler doesn't need to re-prefix. 
* fix(migrations): renumber webhook trigger migration 081 → 089 to avoid collision The branch's 081_autopilot_webhook_triggers.{up,down}.sql collided numerically with 081_runtime_timezone.{up,down}.sql that landed on main, making migration apply order undefined. Renumber to 089 so the file slots after the latest main migration (088_squad_instructions). The SQL itself doesn't conflict — it only creates a partial unique index on autopilot_trigger.webhook_token — but the duplicate prefix is what the migration runner sees, so the filename must move. * fix(autopilot-webhook): address PR review blocking issues - Redact bearer tokens from request logs: paths matching /api/webhooks/autopilots/<token> now log "[redacted]" instead of the token. The resolved trigger ID is plumbed via context so audit lines stay useful for debugging. (Review item Blocking #1.) - Distinguish pgx.ErrNoRows from transient DB errors in token lookup: no-row stays 404 (so providers don't retry on a deleted webhook), other errors return 500 (which providers DO retry, avoiding silent drops on DB blips). (Review item Blocking #2.) - Add per-IP sliding-window rate limiter that runs BEFORE the token lookup, so spraying random tokens can no longer probe the autopilot_trigger index unboundedly. Reuses the existing Lua script with a separate Redis key namespace; falls open on Redis errors. Default budget 30 req/min/IP. (Review item Blocking #3.) The webhook handler now applies the gates in the order: per-IP rate limit → token lookup → per-token rate limit → handler logic. * fix(autopilot): atomic webhook trigger creation + strict kind/timezone validation - Mint the webhook bearer token BEFORE the INSERT and pass it via CreateAutopilotTriggerParams so the row never exists in a half-written kind=webhook + webhook_token=NULL state. On the (vanishingly rare) unique-index collision the whole INSERT is retried with a fresh token — no UPDATE second step. Removes the now-dead attachFreshWebhookToken helper. 
(Review item Recommended #4.) - Add new GET /api/autopilots/{id}/runs/{runId} endpoint that returns a single run including the full trigger_payload. The list response is now slim (omits trigger_payload) so worst-case payload size drops from ~5 MB to ~5 KB. (Review item Recommended #5, server side.) - Reject kind=api with 400 ("kind=api is deprecated; use schedule or webhook") and reject kind=webhook with --timezone with 400 — both surfaces stragglers loudly instead of silently dropping fields. CLI mirrors the check so --timezone with --kind webhook errors client-side. (Review nits.) - Add --yes (-y) flag and an interactive y/N confirmation prompt to `multica autopilot trigger-rotate-url` so the destructive rotate matches the UI's AlertDialog safety. (Review item Recommended #6.) * fix(views): fetch webhook payload on-demand and truncate at 4 KiB - Add useAutopilotRun query hook + getAutopilotRun API client method paired with the new server endpoint. The run-detail dialog now mounts a WebhookPayloadSlot that fetches the full run (incl. trigger_payload) lazily — list responses no longer carry up to 256 KiB × N runs of envelope data. - WebhookPayloadPreview truncates its in-DOM <pre> at 4 KiB with a localized marker so jank-y machines aren't asked to render a 256 KiB JSON blob. The Copy button still yields the full string. - Adds the truncated_marker i18n string to en + zh-Hans. Review items Recommended #5 (frontend) and a nit on the preview's unbounded <pre>. * test(autopilot-webhook): close coverage gaps flagged in PR review - request_logger: redactWebhookPath unit tests + integration test proving the bearer token never lands in slog output, plus the webhook_trigger_id context plumbing. - autopilot_webhook_handler: empty body → 400, archived autopilot → 200 ignored, per-IP rate limiter trips before DB lookup, kind=api and webhook+timezone are rejected at 400, slim list + full detail endpoint round-trip. 
- webhook_rate_limiter: Lua script structure guard (catches reordering even without a live Redis), plus live-Redis tests for both per-token and per-IP limiters (REDIS_TEST_URL gated, matching the existing Redis test pattern in the package). - WebhookPayloadPreview: envelope rendering, fallback shape, and the >4 KiB truncation path with full-payload-on-Copy guarantee. Two branches are documented as code-review-protected rather than covered by tests: the 500-on-DB-error path requires injecting a stub Queries (no interface here), and the cross-workspace defense-in-depth check is unreachable from valid SQL state. * fix(middleware): SetWebhookTriggerID must mutate request in place The round-1 helper returned a fresh http.Request from WithContext, and the webhook handler did `r = SetWebhookTriggerID(r, ...)`. That swaps the handler's local pointer but doesn't propagate the new context back to RequestLogger, which is still holding the original http.Request — so the audit line never actually included webhook_trigger_id in production. The round-1 test happened to pass because it pre-stashed the value on the request before calling ServeHTTP, bypassing the bug it was meant to verify. Switch to in-place mutation via `r = r.WithContext(...)` so the wrapping middleware sees the new context after next.ServeHTTP returns, and update the test to exercise the real call pattern (set the context from inside the handler, assert the surrounding logger reads it). Verified live: an accepted webhook now logs path=/api/webhooks/autopilots/[redacted] webhook_trigger_id=<uuid> * fix(autopilot-webhook): symmetric ErrNoRows split + trusted-proxy gate Round-2 review (Bohan-J, PR #2348 follow-up): - Must-fix #1: the second lookup at autopilot_webhook.go:258 (GetAutopilot after the token resolves) was folding every error into 404. A transient DB blip would tell a webhook sender "not found" and it would never retry. 
Apply the same errors.Is(err, pgx.ErrNoRows) → 404 / else → 500 split as the first lookup got in round 1. - Must-fix #2: clientIPForRateLimit was honoring X-Forwarded-For / X-Real-IP from any caller. An attacker spraying random tokens could just rotate the XFF header and the per-IP bucket became per-request, so the limiter that's specifically supposed to gate spraying before it hits the DB unique index was bypassed. New shape — matches Bohan's suggestion exactly: * Default: r.RemoteAddr only, headers ignored. * Operator opt-in via MULTICA_TRUSTED_PROXIES (comma-separated CIDRs). XFF/X-Real-IP are honored only when r.RemoteAddr is inside one of the listed prefixes; otherwise they're dropped. Wired through .env.example and docker-compose.selfhost.yml so self-host operators can configure their reverse-proxy's CIDR. Invalid CIDRs in the env var are dropped with a single slog.Warn at startup rather than crashing the server. Uses net/netip (stdlib, value-typed) for parsing and containment checks. Verified live on the rebuilt self-host backend: a 35-request spray from one source with rotating XFF gets the expected 30× 404 + 5× 429, proving the per-IP bucket is keyed on the real connection IP. * fix(autopilot): reject cron/timezone PATCH on non-schedule triggers Round-2 review should-fix. CreateAutopilotTrigger already 400s on kind=webhook + timezone/cron_expression, but UpdateAutopilotTrigger silently wrote those fields regardless of prev.Kind. The values then sat in the DB visible to nobody and read by nothing — a back door that left the API contract fuzzy across create vs update. Mirror the create-path discipline: after loading prev, if prev.Kind != "schedule" and the PATCH body sets cron_expression or timezone, return 400 with a clear message. enabled and label remain accepted on every kind. 
The existing prev.Kind == "schedule" guard on next_run_at recompute stays as belt-and-braces, but with this gate in place the recompute branch is now reachable only for the kind it was meant for. * test(autopilot-webhook): close round-2 coverage gaps - IPRateLimitNotBypassedByXFFSpoof: drives the must-fix #2 invariant by rotating XFF across three calls from the same RemoteAddr and asserting the third gets 429. Pre-round-2 this test would have passed for the wrong reason (limiter trusted XFF, so per-bucket collision was incidental); now it pins the bypass-closed property. - IPRateLimitReturns429BeforeDBLookup: updated to set RemoteAddr explicitly and drop the XFF header it was leaning on. With TrustedProxies empty (test default) the limiter keys on the real connection IP, which is what the test wants to assert anyway. - UpdateAutopilotTrigger_RejectsCronExpressionOnWebhookKind + UpdateAutopilotTrigger_RejectsTimezoneOnWebhookKind: drive the round-2 should-fix from the handler boundary. - UpdateAutopilotTrigger_AcceptsEnabledAndLabelOnWebhookKind: counter test so a regression to a blanket reject is caught. * fix(migrations): bump webhook trigger migration 089 → 091 origin/main added 089_squad_no_action_activity_index (and 090_task_is_leader) since our last rebase, re-colliding with our 089_autopilot_webhook_triggers. Bump to 091 so the filename ordering is unambiguous again. The SQL is unchanged — same partial unique index on autopilot_trigger.webhook_token — only the filename moves. * fix(views): dedupe skipped icon in autopilot RUN_VISUAL after rebase The rebase against origin/main merged main's add of `Ban` for the skipped status next to our round-1 `MinusCircle` entry, leaving the RUN_VISUAL map with two `skipped` keys (only the last would have been read at runtime, and MinusCircle had been dropped from the imports during conflict resolution — so the file would not compile). Keep main's `Ban` icon (latest design) and a single `skipped` entry. 
Carry over the round-1 comment about why the muted styling matters for failure-ratio readability. --------- Co-authored-by: Kerim Incedayi <kerim.incedayi@digitalchargingsolutions.com>	2026-05-18 12:17:39 +08:00
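The trusted-proxy gate from the round-2 fix in the webhook commit above (forwarding headers honored only when the connecting peer sits inside a configured CIDR) might look something like the sketch below. The function names and signatures are hypothetical, and taking the right-most X-Forwarded-For entry is an assumption; the commit does not say which entry the real code uses.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// parseTrustedProxies parses a comma-separated CIDR list such as the value of
// MULTICA_TRUSTED_PROXIES. Invalid entries are dropped; the commit says the
// real server also logs a warning, which this sketch omits.
func parseTrustedProxies(raw string) []netip.Prefix {
	var out []netip.Prefix
	for _, part := range strings.Split(raw, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		p, err := netip.ParsePrefix(part)
		if err != nil {
			continue
		}
		out = append(out, p.Masked())
	}
	return out
}

// clientIP returns the address used as the rate-limit bucket key.
// By default only the connection peer (remoteAddr) counts; the forwarded
// header is consulted only when the peer is inside a trusted prefix.
func clientIP(remoteAddr, forwardedFor string, trusted []netip.Prefix) string {
	ap, err := netip.ParseAddrPort(remoteAddr)
	if err != nil {
		return remoteAddr // unparseable peer: fall back to the raw value
	}
	peer := ap.Addr().Unmap()
	peerTrusted := false
	for _, p := range trusted {
		if p.Contains(peer) {
			peerTrusted = true
			break
		}
	}
	if !peerTrusted || forwardedFor == "" {
		return peer.String()
	}
	// Assumption: use the right-most entry, the one the trusted proxy appended.
	hops := strings.Split(forwardedFor, ",")
	last := strings.TrimSpace(hops[len(hops)-1])
	if addr, err := netip.ParseAddr(last); err == nil {
		return addr.Unmap().String()
	}
	return peer.String()
}

func main() {
	trusted := parseTrustedProxies("10.0.0.0/8, not-a-cidr")
	fmt.Println(clientIP("203.0.113.7:5555", "1.2.3.4", trusted))     // untrusted peer: header ignored
	fmt.Println(clientIP("10.1.2.3:5555", "198.51.100.9", trusted)) // trusted proxy: header honored
}
```

With an empty trusted list, rotating the X-Forwarded-For header has no effect on the bucket key, which is the bypass-closed property the round-2 tests pin down.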
Bohan Jiang	a23856bae3	MUL-1624 docs(email): clarify 888888 is opt-in; document SMTP option (#2666 ) * docs(email): clarify 888888 is opt-in via MULTICA_DEV_VERIFICATION_CODE; document SMTP option in self-host docs The startup log line, .env.example, and SELF_HOSTING_ADVANCED.md still implied that the dev master code 888888 is auto-active whenever APP_ENV != "production". That has not been true since the master code was gated behind MULTICA_DEV_VERIFICATION_CODE — the fixed code is disabled by default and must be opted in explicitly. Also extend the docs site with the SMTP relay backend added in #1877: auth-setup, environment-variables, and self-host-quickstart now cover both Resend and SMTP options in EN and ZH. Co-authored-by: multica-agent <github@multica.ai> * docs(email): treat SMTP as an email backend in self-host docs and startup warning Address review feedback on #2666: - server: startup warning now fires only when both RESEND_API_KEY and SMTP_HOST are empty, since either one is a valid email backend. Otherwise the log mis-tells SMTP-only operators that verification codes go to stdout. - self-host-quickstart (EN/ZH): tell readers to fetch the verification code from whichever backend they configured (Resend or SMTP); fall back to stdout only when neither is configured. - auth-setup (EN/ZH): \"without Resend\" → \"without any email backend configured\" so the wording stays correct now that SMTP is a first-class option. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-15 14:18:46 +08:00
apollion69	35e9a7f0f6	feat(email): add SMTP relay as alternative to Resend for self-hosted deployments (#1877 ) * feat(email): add SMTP relay as alternative to Resend Self-hosted deployments often run behind a corporate firewall with an existing SMTP relay (Exchange, Postfix, sendmail) and no access to external SaaS APIs. Resend requires a public domain, an API key, and outbound HTTPS to api.resend.com — all unavailable in air-gapped or private-network setups. This adds a second email delivery path using Go's stdlib net/smtp, activated when SMTP_HOST is set. Priority order: 1. SMTP relay (SMTP_HOST set) 2. Resend API (RESEND_API_KEY set) 3. DEV stdout (neither set) New env vars (all optional, no breaking change): SMTP_HOST — SMTP server hostname SMTP_PORT — port, default 25 SMTP_USERNAME — for authenticated SMTP; empty = unauthenticated relay SMTP_PASSWORD — used only when SMTP_USERNAME is set SMTP_TLS_INSECURE — set to "true" to skip TLS cert verification (for private CA / self-signed certs) The implementation: - Dials TCP, creates smtp.Client manually (avoids smtp.SendMail which does not expose TLS config) - Tries STARTTLS if advertised; uses InsecureSkipVerify only when SMTP_TLS_INSECURE=true (opt-in, nolint:gosec annotated) - Applies PlainAuth only when SMTP_USERNAME is non-empty - Wraps all errors with context for easier debugging - Reuses existing HTML templates from buildInvitationParams for invitation emails (no template duplication) Also updates .env.example and docker-compose.selfhost.yml with the new variables and inline documentation. * fix(email): add dial timeout, session deadline, RFC headers for SMTP path Address review blockers from multica-eve and Bohan-J (PR #1877): - net.Dial → net.DialTimeout(10s) + conn.SetDeadline(30s) so a blackholed SMTP relay cannot hang SendVerificationCode (called synchronously from the auth handler) or leak goroutines in the invitation path. - Add Date, Message-ID, and proper Content-Transfer-Encoding headers. 
Date is required by RFC 5322; many strict relays reject messages without it. Message-ID aids deliverability and threading. - MIME-encode Subject via mime.QEncoding so non-ASCII workspace/inviter names (CJK, emoji) survive without corruption across any RFC 2047-conformant relay. - Probe 8BITMIME after (possible) STARTTLS: use Content-Transfer-Encoding 8bit when the relay advertises 8BITMIME, quoted-printable otherwise — safe for all relay configurations without forcing base64 overhead. - Update SELF_HOSTING_ADVANCED.md to document Option B (SMTP relay) alongside the existing Resend section, including all five env vars and a note that port 465/SMTPS is not yet supported. * fix(email): correct has8Bit assignment order (bool is first return of Extension)	2026-05-15 13:35:01 +08:00
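A few of the decisions in the SMTP commit above are small enough to sketch in isolation: the backend priority order, the RFC 2047 subject encoding via `mime.QEncoding`, and the choice of Content-Transfer-Encoding based on whether the relay advertises 8BITMIME. The helper names below are hypothetical and the actual dialing, STARTTLS, and header assembly are left out.

```go
package main

import (
	"fmt"
	"mime"
)

// pickEmailBackend follows the priority order listed in the commit:
// SMTP relay first, then Resend, then the dev stdout fallback.
func pickEmailBackend(smtpHost, resendAPIKey string) string {
	switch {
	case smtpHost != "":
		return "smtp"
	case resendAPIKey != "":
		return "resend"
	default:
		return "stdout"
	}
}

// encodeSubject MIME-encodes a subject so non-ASCII names survive transport.
// Plain ASCII subjects are returned unchanged by mime.QEncoding.
func encodeSubject(subject string) string {
	return mime.QEncoding.Encode("utf-8", subject)
}

// bodyTransferEncoding picks 8bit when the relay advertises 8BITMIME and
// quoted-printable otherwise, as the commit describes.
func bodyTransferEncoding(relayHas8BitMIME bool) string {
	if relayHas8BitMIME {
		return "8bit"
	}
	return "quoted-printable"
}

func main() {
	fmt.Println(pickEmailBackend("relay.internal", "re_key")) // SMTP wins when both are set
	fmt.Println(encodeSubject("You are invited to 多维 workspace"))
	fmt.Println(bodyTransferEncoding(false))
}
```

In real code the 8BITMIME flag would come from the first (boolean) return value of `smtp.Client.Extension("8BITMIME")`, which is the ordering the final fix in this commit corrects.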
Bohan Jiang	eca36fac84	fix(github): plumb GITHUB_APP_SLUG / GITHUB_WEBHOOK_SECRET through self-host (#2482 ) The GitHub App integration code reads these two env vars and only enables the Connect flow when both are set. .env.example never listed them, and docker-compose.selfhost.yml did not forward them into the backend container, so self-hosters following the integration docs had no working way to turn the feature on. MUL-2107 Co-authored-by: multica-agent <github@multica.ai>	2026-05-12 18:40:17 +08:00
Multica Eve	ce00e05169	Add canonical PostHog core metrics events (#2302 ) * Add canonical PostHog core metrics events Co-authored-by: multica-agent <github@multica.ai> * Address analytics review feedback Co-authored-by: multica-agent <github@multica.ai> * Tighten analytics review follow-ups Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Devv <devv@Devvs-Mac-mini.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 13:12:00 +08:00
Bohan Jiang	89b939b07c	fix(storage): build region-qualified S3 public URLs (#2051 ) (#2065 ) * fix(storage): build region-qualified S3 public URLs (#2051) The uploadedURL fallback (no CloudFront, no custom endpoint) wrote "https://<bucket>/<key>" — missing the ".s3.<region>.amazonaws.com" suffix — so any deployment that pointed S3_BUCKET at a real AWS bucket without a CDN got broken image URLs back to the client. Avatar URLs were persisted in this broken form on the user/agent rows, so profile pictures uploaded via the SDK never rendered. - Track S3_REGION on S3Storage and emit https://<bucket>.s3.<region>.amazonaws.com/<key> by default; fall back to path-style https://s3.<region>.amazonaws.com/<bucket>/<key> when the bucket name contains dots, since the AWS wildcard cert can't validate dotted virtual-hosted hosts. - Teach KeyFromURL to recognise the new region-qualified hosts (both styles) and keep recognising the legacy bucket-only host so historical records can still be deleted/migrated. - Document that S3_BUCKET is the bucket name only, not a hostname, in env-vars docs (en+zh), self-hosting guides, and .env.example. Co-authored-by: multica-agent <github@multica.ai> * feat(storage): warn at startup when S3_BUCKET looks like a hostname Catches the most common misconfiguration shape (S3_BUCKET set to "<bucket>.s3.<region>.amazonaws.com") with a startup log line so operators don't silently end up with a config that signs uploads against an invalid bucket name. A real bucket name can never legitimately contain "amazonaws.com", so the check is a single substring match — no false positives worth carving out. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-06 12:45:55 +08:00
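The URL rule from the S3 commit above (virtual-hosted style by default, path-style when the bucket name contains dots) and the startup hostname check reduce to a few lines. This is a minimal sketch with hypothetical function names; it covers only the fallback case with no CloudFront and no custom endpoint.

```go
package main

import (
	"fmt"
	"strings"
)

// publicURL builds a region-qualified public URL for an object.
// Dotted bucket names use path-style because, per the commit, the AWS
// wildcard certificate cannot validate dotted virtual-hosted hosts.
func publicURL(bucket, region, key string) string {
	if strings.Contains(bucket, ".") {
		return fmt.Sprintf("https://s3.%s.amazonaws.com/%s/%s", region, bucket, key)
	}
	return fmt.Sprintf("https://%s.s3.%s.amazonaws.com/%s", bucket, region, key)
}

// bucketLooksLikeHostname is the single substring check the commit describes
// for warning about S3_BUCKET being set to a full hostname.
func bucketLooksLikeHostname(bucket string) bool {
	return strings.Contains(bucket, "amazonaws.com")
}

func main() {
	fmt.Println(publicURL("avatars", "us-west-2", "u/1.png"))
	fmt.Println(publicURL("assets.example.com", "us-west-2", "u/1.png"))
	fmt.Println(bucketLooksLikeHostname("avatars.s3.us-west-2.amazonaws.com"))
}
```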
devv-eve	6ef711cd35	fix: gate dev verification code behind explicit env (#1773 ) * fix: gate dev verification code behind explicit env * docs: fold dev verification code into env table * docs: clarify fixed verification code opt-in --------- Co-authored-by: Eve <eve@multica.ai>	2026-04-28 15:14:07 +08:00
devv-eve	f864a07bd5	feat: add server Prometheus metrics endpoint Add Prometheus metrics endpoint with local-bind listener support and baseline metrics collectors.	2026-04-28 14:29:01 +08:00
LinYushen	99154d97b9	Restrict /health/realtime metrics exposure (MUL-1342) (#1608 ) * Restrict /health/realtime metrics exposure (MUL-1342) The realtime metrics endpoint was registered on the public router with no authentication, exposing per-event/per-scope counters, redis.last_error, and redis.node_id to anonymous callers. This enables information disclosure and traffic profiling. Move the handler behind a token + loopback policy: - If REALTIME_METRICS_TOKEN is set, require Authorization: Bearer <token> using a constant-time compare. Reject other callers with 401 plus a WWW-Authenticate hint. - If the env var is unset, only serve loopback callers and return 404 to remote clients so the endpoint is not enumerable. This keeps local dev workflows working without configuration. The handler is extracted into health_realtime.go with focused unit tests covering the token, loopback, and rejection paths. .env.example documents the new variable. Refs: https://github.com/multica-ai/multica/issues/1606 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fail closed for proxied /health/realtime requests (MUL-1342) Addresses review on PR #1608: when the server runs behind a reverse proxy (Caddy / Nginx -> localhost:8080), public callers reach the Go handler with RemoteAddr=127.0.0.1, so the previous loopback shortcut exposed the metrics surface in self-hosted deployments. The no-token path now treats any forwarding header (X-Forwarded-For / -Host / -Proto, X-Real-Ip, Forwarded) as a 'this request was proxied, can't attribute, fail closed' signal and returns 404. Direct loopback callers without those headers still work for local dev. Token-gated path is unchanged. Tests cover all listed proxy headers (incl. multi-hop XFF chain and RFC 7239 Forwarded) over both 127.0.0.1 and ::1, plus a regression case ensuring an empty/whitespace forwarding header does not break direct loopback access. 
.env.example updated to call out that proxied deployments must configure REALTIME_METRICS_TOKEN. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: CC-Girl <cc-girl@multica.ai> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-24 14:04:10 +08:00
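The access policy for /health/realtime described in the commit above (token with constant-time compare when REALTIME_METRICS_TOKEN is set; otherwise direct loopback only, failing closed on any forwarding header) can be sketched as a pure decision function. The names `metricsStatus` and `probe` are hypothetical, and the sketch returns only the status code; the WWW-Authenticate hint on 401 is omitted.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net"
	"net/http"
	"strings"
)

// forwardingHeaders lists the headers the commit treats as "this request was proxied".
var forwardingHeaders = []string{
	"X-Forwarded-For", "X-Forwarded-Host", "X-Forwarded-Proto", "X-Real-Ip", "Forwarded",
}

// metricsStatus returns the HTTP status the endpoint would answer with.
func metricsStatus(r *http.Request, token string) int {
	if token != "" {
		// Token mode: require "Authorization: Bearer <token>".
		const prefix = "Bearer "
		auth := r.Header.Get("Authorization")
		if strings.HasPrefix(auth, prefix) &&
			subtle.ConstantTimeCompare([]byte(auth[len(prefix):]), []byte(token)) == 1 {
			return http.StatusOK
		}
		return http.StatusUnauthorized
	}
	// No-token mode: any non-blank forwarding header means the caller cannot
	// be attributed, so answer 404 and keep the endpoint non-enumerable.
	for _, name := range forwardingHeaders {
		if strings.TrimSpace(r.Header.Get(name)) != "" {
			return http.StatusNotFound
		}
	}
	host, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		return http.StatusNotFound
	}
	if ip := net.ParseIP(host); ip != nil && ip.IsLoopback() {
		return http.StatusOK
	}
	return http.StatusNotFound
}

// probe builds a bare request from a remote address and header key/value
// pairs, then evaluates the policy. It exists only to make the sketch easy to try.
func probe(token, remoteAddr string, kv ...string) int {
	r := &http.Request{RemoteAddr: remoteAddr, Header: http.Header{}}
	for i := 0; i+1 < len(kv); i += 2 {
		r.Header.Set(kv[i], kv[i+1])
	}
	return metricsStatus(r, token)
}

func main() {
	fmt.Println(probe("", "127.0.0.1:5000"))                                     // direct loopback
	fmt.Println(probe("", "127.0.0.1:5000", "X-Forwarded-For", "198.51.100.4")) // proxied
	fmt.Println(probe("s3cret", "203.0.113.9:5000", "Authorization", "Bearer s3cret"))
}
```

Treating a blank or whitespace-only forwarding header as absent matches the regression case the commit mentions for direct loopback access.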
devv-eve	fbf41bde73	feat(selfhost): ship public GHCR deployment flow Publish stable GHCR self-host images, switch self-host deploys to official image pulls with a source-build fallback, and move self-host signup / Google OAuth config onto runtime /api/config.	2026-04-22 16:58:42 +08:00
devv-eve	637bdc8eb3	feat(analytics): full PostHog pipeline + 6 funnel events (MUL-1122) (#1367 ) * feat(analytics): add PostHog client with async batch shipping Introduces server/internal/analytics, the shipping layer for the product funnel defined in docs/analytics.md. Capture is non-blocking — events are enqueued into a bounded channel and a background worker batches them to PostHog's /batch/ endpoint. A broken backend drops events rather than blocking request handlers. Local dev and self-hosted instances run a noop client until the operator sets POSTHOG_API_KEY. This is PR 1 of MUL-1122; signup and workspace_created emission land in the follow-up commit so this change is independently reviewable. * feat(server): emit signup and workspace_created analytics events Wires analytics.Client through handler.New and main, then emits the first two funnel events: - signup fires from findOrCreateUser (which now reports isNew), covering both the verification-code and Google OAuth entry points — a single emission site guarantees Google signups aren't missed. - workspace_created fires after the CreateWorkspace transaction commits, with is_first_workspace computed from a post-commit ListWorkspaces count so we can distinguish fresh-user activation from returning-user expansion. Tests use analytics.NoopClient so nothing ships from test runs. PR 1 of MUL-1122; runtime_registered and issue_executed follow in later PRs per the plan. * refactor(analytics): drop is_first_workspace from workspace_created Stamping "is this the user's first workspace?" at emit time races under concurrent CreateWorkspace requests: two transactions committing close together can both read a post-commit count greater than one and both emit false. Fixing it at the SQL layer requires a schema change we don't want in PR 1. 
PostHog answers the same question exactly from the event stream (funnel on "first time user does X" / cohort on $initial_event), so removing the property loses no information and makes the emit side race-free. * docs(analytics): document self-host safety defaults Spell out why self-hosted instances never ship events upstream by default (empty POSTHOG_API_KEY → noop client) and explain how operators can point at their own PostHog project without any code change. * feat(analytics): emit runtime_registered, issue_executed, team_invite_* Three server-side funnel events, all gated on first-time state transitions so retries and re-runs don't inflate the WAW buckets: - runtime_registered fires from DaemonRegister when UpsertAgentRuntime reports (xmax = 0) — i.e. the row was inserted, not updated. Heartbeats and re-registrations stay silent. - issue_executed fires from CompleteTask after an atomic UPDATE issue SET first_executed_at = now() WHERE id = $1 AND first_executed_at IS NULL flips the column for the first time. Retries, re-assignments, and comment-triggered follow-up tasks hit the WHERE clause and no-op. Carries nth_issue_for_workspace so the ≥1/≥2/≥5/≥10 buckets filter without extra queries. - team_invite_sent fires from CreateInvitation and team_invite_accepted from AcceptInvitation, closing the expansion funnel. Adds a 050 migration for issue.first_executed_at plus a partial index so the workspace-scoped executed-count query doesn't scan the never-executed tail. * feat(config): surface PostHog key via /api/config Extends AppConfig with posthog_key / posthog_host sourced from env on every request (so operators can rotate the key via secret refresh without a restart). Reading the key off the server — rather than baking it into the frontend bundle via NEXT_PUBLIC_* — means self-hosted instances inherit the blank key automatically and never ship events upstream. 
* feat(analytics): wire posthog-js identify + UTM capture on the client Adds @multica/core/analytics — a thin wrapper around posthog-js that owns attribution capture and identity merge. Posthog-js config comes from /api/config (not NEXT_PUBLIC_), so self-hosted instances whose server returns an empty key automatically run the SDK inert. captureSignupSource stamps a multica_signup_source cookie with UTM params and the referrer's origin (never the full referrer — that can leak OAuth code/state in the callback URL). The backend signup event reads this cookie on new-user creation. Identity flows: - auth-initializer fires identify() right after getMe() resolves, on both cookie and token paths. A getConfig/getMe race is handled by buffering a pending identify inside the analytics module and flushing it once initAnalytics finishes. - auth store calls identify() on verifyCode / loginWithGoogle / loginWithToken and resetAnalytics() on logout so the next login merges cleanly without bleeding events. docs(analytics): describe runtime_registered, issue_executed, invite events Fills in the schema for the remaining funnel events. Captures the design commentary that belongs next to the contract rather than in a PR description — in particular why issue_executed uses the atomic first_executed_at flip instead of counting task-terminal events, and why runtime_registered relies on xmax = 0 rather than a query-then-write. * fix(analytics): drop non-atomic nth_issue_for_workspace from issue_executed Computing the workspace's Nth-issue ordinal at emit time is not atomic under concurrent first-completions — two transactions can both run MarkIssueFirstExecuted, then both run CountExecutedIssuesInWorkspace, and both observe count=1 before either has committed, so both events go out stamped as n=1. 
Serialising it would mean a per-workspace advisory lock or a SERIALIZABLE-isolated tx; PostHog answers the same question exactly at query time via row_number() partitioned by workspace_id, so the emit-time property adds risk without adding information. Removes the property from analytics.IssueExecuted, deletes the unused CountExecutedIssuesInWorkspace query, and regenerates sqlc. The partial index stays — any future workspace-scoped executed-issue query will want it. * fix(analytics): wire $pageview and harden signup_source cookie payload Two frontend fixes from the PR review: - PageviewTracker, mounted under WebProviders, fires capturePageview on every Next.js App Router path / query-string change. Without this the capturePageview helper in @multica/core/analytics was never called and the acquisition funnel's / → signup step was empty. - captureSignupSource now caps each UTM / referrer value at 96 chars before JSON.stringify, and drops the whole cookie when the serialised payload still exceeds 512 chars. Previously the overall slice(0, 256) could leave a half-JSON string on the wire that neither the backend nor PostHog could parse. Both capturePageview and identify now buffer a single pending call when fired before initAnalytics resolves — otherwise the initial "/" pageview and same-turn login identify race the /api/config fetch and get dropped. resetAnalytics clears both buffers so a logout→login cycle stays clean. * fix(analytics): URL-decode signup_source cookie on read Go does not URL-decode Cookie.Value automatically, so the frontend's JSON-then-encodeURIComponent payload was landing in PostHog as percent-encoded garbage (%7B%22utm_source...). Unescape on read so the backend receives the original JSON string the frontend intended, and drop values that fail to decode or exceed the server-side cap — sending truncated garbage is worse than sending nothing. Oversized-cookie guard matches the frontend's SIGNUP_SOURCE_MAX_LEN. 
* docs(analytics): reflect nth-issue drop, $pageview wiring, cookie encoding Pulls the schema doc back in line with the code: issue_executed no longer advertises nth_issue_for_workspace (with a note about why PostHog derives it at query time instead), the frontend $pageview section names the actual PageviewTracker component that fires it, and the signup_source section documents the per-value cap / overall drop rule and the encode-on-write / decode-on-read contract. --------- Co-authored-by: Jiang Bohan <bhjiang@outlook.com>	2026-04-21 14:42:52 +08:00
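The signup_source cookie contract described in the commit above (per-value cap, whole-cookie drop, encode on write, decode on read) can be sketched roughly as follows. This is an illustrative Python sketch, not the repository's TypeScript or Go code: the function names are hypothetical, and reusing 512 as the read-side cap is an assumption based on the note that the server guard matches the frontend limit.

```python
import json
from typing import Optional
from urllib.parse import quote, unquote

MAX_VALUE_LEN = 96     # per-value cap named in the commit message
MAX_PAYLOAD_LEN = 512  # overall cap named in the commit message


def encode_signup_source(fields: dict) -> Optional[str]:
    """Write side (hypothetical name): cap values, serialise, percent-encode."""
    capped = {k: str(v)[:MAX_VALUE_LEN] for k, v in fields.items() if v}
    payload = json.dumps(capped, separators=(",", ":"))
    if len(payload) > MAX_PAYLOAD_LEN:
        return None  # drop the whole cookie rather than ship half a JSON string
    return quote(payload, safe="")


def decode_signup_source(cookie_value: str) -> Optional[dict]:
    """Read side (hypothetical name): percent-decode, enforce cap, reject non-JSON."""
    raw = unquote(cookie_value)
    if len(raw) > MAX_PAYLOAD_LEN:  # assumed to mirror the frontend cap
        return None
    try:
        parsed = json.loads(raw)
    except ValueError:
        return None  # sending nothing is preferred over sending garbage
    return parsed if isinstance(parsed, dict) else None
```

The point of the round trip is that the reader sees the same JSON the writer produced, and anything oversized or undecodable is dropped instead of forwarded.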
Bohan Jiang	824d943848	fix(auth): derive cookie Secure flag from FRONTEND_ORIGIN scheme (#1390) The session cookie's Secure flag was tied to APP_ENV, and the docker-compose self-host stack defaults APP_ENV to "production". On plain-HTTP self-host deployments (LAN IP, private network) the browser silently drops Secure cookies, leaving every subsequent /api/* call anonymous and surfacing as 401 "auth: no token found" right after a successful login. Derive Secure from the scheme of FRONTEND_ORIGIN so HTTPS origins get Secure cookies and plain-HTTP origins get non-secure cookies the browser will actually store. Also harden cookieDomain() against the other common trap: COOKIE_DOMAIN=<ip>, which RFC 6265 forbids and browsers reject. Log a one-shot warning and fall back to host-only. Docs: correct the COOKIE_DOMAIN description (it was labelled as CloudFront-only but applies to session cookies too) and call out the IP-literal pitfall in SELF_HOSTING_ADVANCED.md, self-hosting.mdx, and .env.example. Refs #1321	2026-04-20 19:53:15 +08:00
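A minimal sketch of the two rules in the commit above: the Secure flag follows the scheme of FRONTEND_ORIGIN, and an IP-literal COOKIE_DOMAIN falls back to a host-only cookie. The Python function names are hypothetical stand-ins for the Go helpers, and the one-shot warning is omitted.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit


def cookie_secure(frontend_origin: str) -> bool:
    """Secure only when the configured frontend origin is served over HTTPS."""
    return urlsplit(frontend_origin).scheme.lower() == "https"


def cookie_domain(configured: str) -> Optional[str]:
    """Return the Domain attribute to set, or None for a host-only cookie."""
    value = configured.strip()
    if not value:
        return None
    try:
        ipaddress.ip_address(value.strip("[]"))
    except ValueError:
        return value  # not an IP literal, so usable as a Domain attribute
    return None  # IP literal: browsers reject it, so fall back to host-only
```

For example, a LAN deployment with FRONTEND_ORIGIN=http://192.168.1.10:3000 and COOKIE_DOMAIN=192.168.1.10 would get a non-secure, host-only cookie under this sketch.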
LinYushen	07034f4455	feat(server): configurable pgxpool size with sane defaults (#1381 ) * feat(server): configurable pgxpool size with sane defaults pgxpool.New(ctx, url) silently sets MaxConns = max(4, NumCPU). On the prod pods that resolved to 4, which got fully saturated by daemon claim/heartbeat traffic (~3800 acquires/s) and showed up as ~900ms acquire waits on every query — the actual root cause of the 3s+ /tasks/claim tail latency. The db pool stats logging from #1378 confirmed this with empty_acquire_delta == acquire_count_delta. Switch to pgxpool.ParseConfig + NewWithConfig and apply per-pod defaults of MaxConns=25 / MinConns=5, both overridable via env vars (DATABASE_MAX_CONNS / DATABASE_MIN_CONNS) so the size can be tuned in prod without a redeploy. The defaults follow the standard 'small pool, lots of waiters' guidance for Postgres (PG community / HikariCP formula `(core_count * 2) + effective_spindle_count`); 25 leaves headroom for bursts and occasional long queries while staying safely under typical managed-Postgres max_connections ceilings when multiplied across pods. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(server): respect DATABASE_URL pool_* params; add precedence tests Address review feedback on #1381: - Configuration precedence is now explicit: DATABASE_MAX_CONNS env > pool_max_conns query param on DATABASE_URL > built-in default. Same for min_conns. Previously the env-empty path unconditionally overwrote whatever ParseConfig had read from the URL — a silent regression for deployments that already tuned pool size via the connection string. - Add unit tests in dbstats_test.go covering each precedence branch (defaults, URL-only, env-over-URL, partial URL, invalid env, min>max clamp). - Move pool tuning vars out of 'Required Variables' into a new 'Database Pool Tuning (Optional)' section in SELF_HOSTING_ADVANCED.md so self-hosters don't think they need to set them. - Add commented entries in .env.example. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(server): invalid pool env falls back to URL/code default, never pgx 4 Address second round of review on #1381: Previous code passed cfg.MaxConns / cfg.MinConns as the envInt32 fallback, which meant an invalid DATABASE_MAX_CONNS value silently fell back to ParseConfig's value — i.e. pgx's built-in default of 4/0 when the URL had no pool_* params. That's exactly the bad value this PR exists to remove, and the previous test (TestPoolSizing_InvalidEnvFallsBack) accidentally locked it in. Compute the non-env fallback first (URL pool_* if present, else code default 25/5) and pass that to envInt32. Misconfigured env now lands on the same value as if the env were unset — never on the pgx default. Replace the loose 'max > 0' assertion with two precise tests: - invalid env + no URL param → code default (25/5) - invalid env + URL param → URL value Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-20 17:07:19 +08:00
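The precedence rule from the commit above (env var, then pool_* query parameter on DATABASE_URL, then the code default of 25/5, with invalid env values treated as unset) can be sketched as below. This is a Python illustration rather than the actual pgxpool code; the helper names are hypothetical, and clamping min down to max is an assumption about how the "min>max clamp" case resolves.

```python
from typing import Mapping, Optional, Tuple
from urllib.parse import parse_qs, urlsplit

DEFAULT_MAX_CONNS = 25
DEFAULT_MIN_CONNS = 5


def _parse_int(raw: Optional[str], floor: int) -> Optional[int]:
    """Return raw as an int >= floor, or None when missing or invalid."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return None
    return value if value >= floor else None


def resolve_pool_size(database_url: str, env: Mapping[str, str]) -> Tuple[int, int]:
    query = parse_qs(urlsplit(database_url).query)

    def pick(env_name: str, url_param: str, default: int, floor: int) -> int:
        # Compute the non-env fallback first: URL param if valid, else code default.
        url_value = _parse_int(query.get(url_param, [None])[0], floor)
        fallback = url_value if url_value is not None else default
        # An invalid env value lands on the same fallback as an unset one.
        env_value = _parse_int(env.get(env_name), floor)
        return env_value if env_value is not None else fallback

    max_conns = pick("DATABASE_MAX_CONNS", "pool_max_conns", DEFAULT_MAX_CONNS, 1)
    min_conns = pick("DATABASE_MIN_CONNS", "pool_min_conns", DEFAULT_MIN_CONNS, 0)
    return max_conns, min(min_conns, max_conns)  # assumed clamp direction
```

Computing the fallback before consulting the environment is what keeps a typo in DATABASE_MAX_CONNS from silently reverting to a tiny driver default.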
Bohan Jiang	96ee5bba52	docs(selfhost): surface APP_ENV + 888888 gating in .env.example (#1361 ) The v0.2.6 self-host security fix (#1307) defaults APP_ENV to "production" in docker-compose.selfhost.yml, which disables the 888888 master verification code. The follow-up docs PR (#1313) covered SELF_HOSTING.md and the installers, but `.env.example` — the file users actually copy and edit — still makes no mention of APP_ENV, so operators who don't read the prose docs hit the exact same "888888 stopped working after upgrade" confusion reported in #1331. - Add APP_ENV= to .env.example with a comment block that spells out the three cases (Docker default, local dev, evaluation) and warns against enabling the bypass on public instances. Keeping the value empty preserves the current `make dev` UX (Go server reads empty → treats as non-production → 888888 works locally) while `${APP_ENV:-production}` in the compose file still ensures public Docker deployments are safe by default. - Update the existing 888888 mention under # Email so it no longer contradicts the new gating rule. - Update the `make selfhost` post-start banner, which still told operators to "Log in with any email + verification code: 888888" even after #1307 disabled that path by default.	2026-04-20 13:26:42 +08:00
Azaan Ali Raza	b428f36ca6	feat: add ALLOW_SIGNUP + ALLOWED_EMAIL_* for self-hosted instances (#1098) Closes #930 - Added environment variables to control signups - Updated frontend to hide signup text when disabled - Added backend check to block new user creation via magic link - Updated .env.example	2026-04-19 21:02:42 -07:00
LinYushen	5dad1f0915	fix(selfhost): clear hardcoded NEXT_PUBLIC_API_URL/WS_URL defaults (#1063 ) The .env.example had hardcoded http://localhost:8080 defaults for NEXT_PUBLIC_API_URL and NEXT_PUBLIC_WS_URL. When users copied .env.example to .env and customized the backend port, the old defaults would still get baked into the frontend at docker build time via NEXT_PUBLIC_WS_URL build arg, causing API/WebSocket connection failures. With empty defaults: - Docker selfhost: frontend uses relative paths, Next.js rewrites proxy to backend internally — works regardless of external port config - Local dev (make dev): Makefile sets these to localhost:$PORT automatically - Browser fallback: deriveWsUrl() auto-derives WebSocket URL from page origin when NEXT_PUBLIC_WS_URL is empty Closes #1055 Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 14:56:30 +08:00
LinYushen	c0db3e0e76	Revert "feat(selfhost): add single-domain Caddy setup (#899)" (#1062) This reverts commit `100146c49e`.	2026-04-15 14:44:47 +08:00
KimSeongJun	100146c49e	feat(selfhost): add single-domain Caddy setup (#899) * selfhost: add single-domain caddy setup * fix(selfhost): address Caddy review feedback	2026-04-14 20:20:26 -07:00
LinYushen	95bfd7dd96	feat(auth): migrate auth token to HttpOnly Cookie & WebSocket Origin whitelist (#819 ) * feat(auth): migrate auth token to HttpOnly cookie & implement WebSocket Origin whitelist Security improvements from the MUL-566 audit report: 1. Auth token is now set as an HttpOnly, SameSite=Lax cookie on login, preventing XSS-based token theft. Cookie-based auth includes CSRF protection via double-submit cookie pattern. The Authorization header path is preserved for Electron desktop app and CLI/PAT clients. 2. WebSocket upgrader now validates the Origin header against a configurable allowlist (ALLOWED_ORIGINS env var), rejecting connections from unauthorized origins. Backend: new auth cookie helpers, middleware reads cookie as fallback, WS handler accepts cookie auth, Origin whitelist, logout endpoint. Frontend: CSRF token in API headers, cookie-aware auth store and WS client, web app opts into cookieAuth mode while desktop keeps tokens. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(auth): address PR review — Strict cookies, HMAC-bound CSRF, origin sync 1. SameSite=Lax → SameSite=Strict per spec requirement 2. CSRF token now HMAC-signed with auth token (nonce.signature format), preventing subdomain cookie injection attacks 3. allowedWSOrigins uses atomic.Value to eliminate data race 4. Removed magic "cookie" sentinel string in WSProvider — pass null token and guard with boolean check instead 5. Removed dead delete uploadHeaders["Content-Type"] in API client Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 12:13:35 +08:00
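The HMAC-bound CSRF token in the commit above (nonce.signature, signed with the auth token) might look roughly like this. It is a sketch under assumptions: SHA-256 as the hash, hex encoding, and the function names are all illustrative choices that the commit message does not specify.

```python
import hashlib
import hmac
import secrets


def _sign(nonce: str, auth_token: str) -> str:
    # Assumption: HMAC-SHA256 keyed by the auth token, hex-encoded.
    return hmac.new(auth_token.encode(), nonce.encode(), hashlib.sha256).hexdigest()


def issue_csrf_token(auth_token: str) -> str:
    """Build a nonce.signature token bound to one session's auth token."""
    nonce = secrets.token_hex(16)
    return f"{nonce}.{_sign(nonce, auth_token)}"


def verify_csrf_token(csrf_token: str, auth_token: str) -> bool:
    """Accept only tokens whose signature matches this session's auth token."""
    nonce, sep, signature = csrf_token.partition(".")
    if not sep or not nonce or not signature:
        return False
    expected = _sign(nonce, auth_token)
    return hmac.compare_digest(signature.encode(), expected.encode())
```

Because the signature depends on the HttpOnly auth token, a cookie injected from a sibling subdomain cannot produce a CSRF value that verifies, which is the weakness of a plain double-submit cookie that the review fix addresses.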
Antar Das	d32c419b6d	feat(storage): add local file storage fallback (#710) * feat(storage): add local file storage fallback - Add local storage implementation for file uploads - Update .env.example with LOCAL_UPLOAD_DIR and LOCAL_UPLOAD_BASE_URL - Integrate local storage into server router and handlers - Add storage abstraction layer with util functions * ♻️ refactor(storage): improve path handling and file serving switch from path to filepath for better cross-platform support and replace manual file serving logic with http.ServeFile to enhance security against path traversal. update unit tests to use t.Setenv for cleaner environment variable management.	2026-04-12 14:04:22 +08:00
Jiayuan Zhang	f100b5b707	fix(auth): graceful email degradation for self-hosting (#742 ) * fix(auth): log email send errors and gracefully degrade in non-production In non-production environments (APP_ENV != "production"), if sending the verification code email fails, log the error as a warning and still return success. This lets self-hosting users log in with the master code (888888) even when their Resend configuration is incomplete (e.g. unverified from-domain). In production, the behavior is unchanged — email failures return 500. Also adds guidance in .env.example about RESEND_FROM_EMAIL for self-hosters. Closes #723 * fix(auth): remove APP_ENV degradation, keep error logging only Remove the APP_ENV-based graceful degradation for email send failures — it's risky if users forget to set APP_ENV=production. Instead, always return 500 on email failure (safe for production) and rely on the error log (slog.Error) with the actual Resend error for debugging. Self-hosters who don't need real emails should leave RESEND_API_KEY empty (codes print to stdout, master code 888888 works).	2026-04-12 02:30:01 +08:00
Jiang Bohan	14fe8e9df9	feat(auth): add Google OAuth login Support Google login that links to existing accounts by email. When a user who registered via email OTP signs in with Google using the same email, they are linked to the same account. Backend: - Add POST /auth/google endpoint that exchanges Google auth code for tokens, fetches user profile, and calls findOrCreateUser() - Updates user name and avatar from Google profile on first Google login Frontend: - Add "Continue with Google" button on login page (shown when NEXT_PUBLIC_GOOGLE_CLIENT_ID is configured) - Add /auth/callback page to handle Google OAuth redirect - Add loginWithGoogle to auth store and API client	2026-04-07 15:25:26 +08:00
Jiang Bohan	89bedb8f5c	feat(web): support REMOTE_API_URL env for proxying to remote backend - Load root .env in next.config.ts so REMOTE_API_URL is available - Default fallback remains localhost:8080 (no impact on existing setups) - Add REMOTE_API_URL to .env.example with documentation	2026-03-31 16:53:32 +08:00
yushen	29a80e057e	feat(upload): add file upload API with S3 + CloudFront signed cookies Add POST /api/upload-file endpoint that uploads files to S3 and returns CDN URLs protected by CloudFront signed cookies (same pattern as Linear). Infrastructure: - Two private S3 buckets (static.multica.ai, static-staging.multica.ai) - Two CloudFront distributions with OAC and Trusted Key Groups - ACM wildcard cert in us-east-1, DNS records in Route 53 - RSA signing key stored in AWS Secrets Manager Backend: - S3 storage service with CloudFront CDN domain support - CloudFront signed cookie generation (RSA-SHA1) - Private key loaded from Secrets Manager (env var fallback for local dev) - Cookies set on login (VerifyCode) with 72h expiry matching JWT - Upload handler: multipart form → S3 → CloudFront URL response Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 14:41:17 +08:00
LinYushen	5c9c2f69fd	feat(auth): email verification login and personal access tokens * feat(auth): add email verification login flow with 401 auto-redirect Replace the old OAuth-based login with email verification codes: - Backend: send-code / verify-code endpoints, verification_codes table (migration 009), rate limiting, Resend email service - Frontend: two-step login UI (email → 6-digit OTP), auth store with sendCode/verifyCode - SDK: ApiClient gains onUnauthorized callback; 401 responses auto-clear token and redirect to /login - Fix login button staying disabled due to global isLoading state Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(auth): add brute-force protection, redirect loop guard, and expired code cleanup - VerifyCode: increment attempts on wrong code, reject after 5 failed tries (migration 010) - onUnauthorized: skip redirect if already on /login to prevent infinite loops - SendCode: best-effort cleanup of expired verification codes older than 1 hour Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(auth): add master verification code for non-production environments Allow code "888888" to bypass email verification in non-production environments to simplify development and testing workflows. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(auth): add personal access tokens for CLI and API authentication Add full-stack PAT support: users create tokens in Settings, CLI authenticates via `multica auth login`. Server stores SHA-256 hashes only. Auth middleware extended to accept both JWTs and PATs (distinguished by `mul_` prefix). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 14:32:30 +08:00
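The personal access token scheme in the commit above (only SHA-256 hashes stored, PATs told apart from JWTs by the `mul_` prefix) can be illustrated with a short sketch. The token length, hex encoding, and function names are assumptions made for the example, not details taken from the repository.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

PAT_PREFIX = "mul_"  # prefix named in the commit message


def new_pat() -> Tuple[str, str]:
    """Return (plaintext shown once to the user, hash to store server-side)."""
    token = PAT_PREFIX + secrets.token_hex(20)  # length is an assumption
    return token, hashlib.sha256(token.encode()).hexdigest()


def credential_kind(bearer: str) -> str:
    """Route a bearer credential to the PAT or JWT verification path."""
    return "pat" if bearer.startswith(PAT_PREFIX) else "jwt"


def pat_matches(presented: str, stored_hash: str) -> bool:
    """Compare the hash of the presented token against the stored hash."""
    digest = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(digest, stored_hash)
```

Storing only the digest means a leaked database row cannot be replayed as a credential; the plaintext exists only at creation time and in the client's configuration.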
Jiayuan Zhang	2c28c4cba2	refactor(dev): share postgres across main and worktrees	2026-03-24 14:27:35 +08:00
Jiayuan Zhang	cdfa63af15	feat(runtime): add local codex daemon pairing	2026-03-24 12:03:14 +08:00
Jiayuan Zhang	81e64e9fce	Add workspace management and isolated worktree environments	2026-03-23 18:12:11 +08:00
Jiayuan Zhang	d4f5c5b16f	feat: pivot to AI-native task management platform (#232 ) Replace the agent framework codebase with a new monorepo structure for an AI-native Linear-like product where agents are first-class citizens. New architecture: - server/ — Go backend (Chi + gorilla/websocket + sqlc) - API server with REST routes for issues, agents, inbox, workspaces - WebSocket hub for real-time updates - Local daemon entry point for agent runtime connection - PostgreSQL migration with 13 tables (issue, agent, inbox, etc.) - WebSocket protocol types for server<->daemon communication - apps/web/ — Next.js 16 frontend - Dashboard layout with sidebar navigation - Route skeleton: inbox, issues, agents, board, settings - packages/ui/ — Preserved shadcn/ui design system (26+ components) - packages/types/ — Full API contract types (Issue, Agent, Workspace, Inbox, Events) - packages/sdk/ — REST ApiClient + WebSocket WSClient - packages/store/ — Zustand stores (issue, agent, inbox, auth) - packages/hooks/ — React hooks (useIssues, useAgents, useInbox, useRealtime) - packages/utils/ — Shared utilities Removed: apps/cli, apps/desktop, apps/mobile, apps/gateway, packages/core, skills/, and all agent-framework code. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-20 17:55:49 +08:00
Jiayuan Zhang	005908710e	chore: add local dev script for Gateway + Desktop with Telegram bot Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-14 00:39:43 +08:00


52 Commits