* feat(lark): serve Feishu and Lark from one deployment, per installation
The Lark integration was locked to a single open-platform host chosen
deployment-wide (MULTICA_LARK_HTTP_BASE_URL / _CALLBACK_BASE_URL,
defaulting to open.feishu.cn), so one deployment could talk to only the
mainland Feishu cloud OR Lark international — never both. Teams on the
other tenant could not use the integration at all.
Make the host per-installation. The device-flow installer already
auto-detects the tenant (Lark emits tenant_brand="lark" mid-poll); we now
persist that as lark_installation.region, carry it on
InstallationCredentials.Region, and resolve the open-platform host per
call (REST + WS bootstrap) from the region. An explicit cfg.BaseURL
(env / httptest) still overrides every region, so existing tests and
staging/proxy setups keep working.
- migration 116: lark_installation.region TEXT NOT NULL DEFAULT 'feishu'
CHECK (region IN ('feishu','lark')) — existing rows are all mainland.
- lark.Region enum + OpenPlatformBaseURL/RegionOrDefault helpers.
- registration: thread the detected region into finishSuccess so the
install-time GetBotInfo hits the right cloud AND the row records it.
- every credential-build site (patcher, replier, WS provider, union_id
backfill) copies region off the installation row.
- region is part of the WS supervisor fingerprint so a re-install that
switches cloud restarts the connection.
- API: surface region on the installation listing DTO.
MUL-3083
Co-authored-by: multica-agent <github@multica.ai>
* feat(lark): surface installation region in settings UI
Read the per-installation region off the listings response: build the
"Manage in Lark" dev-console host from it (open.feishu.cn vs
open.larksuite.com instead of a hardcoded mainland host) and render a
Feishu / Lark badge on each connected bot. The field is optional and
defaults to Feishu when an older server omits it (API-compat). Adds the
region_feishu / region_lark labels to all four locales.
MUL-3083
Co-authored-by: multica-agent <github@multica.ai>
* docs(lark): document simultaneous Feishu + Lark support
The cloud each bot belongs to is now auto-detected at install and stored
per installation, so one deployment serves both. Replace the old
"point MULTICA_LARK_HTTP_BASE_URL at larksuite for international tenants"
guidance (now just an optional override) in all four locales.
MUL-3083
Co-authored-by: multica-agent <github@multica.ai>
* fix(lark): repair legacy Lark-international installs on upgrade
Review follow-up (MUL-3083). Migration 116 backfilled every existing
lark_installation to region='feishu', assuming all historical rows were
mainland. But self-host deployments could already run Lark international
via the deployment-wide MULTICA_LARK_HTTP_BASE_URL override, so those
rows are really Lark — clearing the override after upgrade (which the new
docs invite) would route them to open.feishu.cn and break them.
Add a one-shot startup repair, BackfillRegionFromLegacyOverride, fired
off the hot path like BackfillBotUnionIDs: when the deployment's global
base-URL override targets open.larksuite.com, relabel the still-default
'feishu' rows to 'lark'. Gating on the deployment-wide override is what
makes it safe — every pre-existing install on such a deployment was Lark.
Idempotent; no-op on mainland / fresh deployments. Verified end-to-end
against a scratch DB (flip then 0-row idempotent re-run).
Also document that a Lark/飞书 app_id is globally unique across both
clouds, which is what makes the app_id-keyed token cache and the
UNIQUE(app_id) constraint safe across regions (review nit).
MUL-3083
Co-authored-by: multica-agent <github@multica.ai>
* docs(lark): fix ops guidance to match auto per-installation region
Review follow-up (MUL-3083). .env.example and docker-compose.selfhost.yml
still told operators that international Lark requires pointing both base
URLs at open.larksuite.com — now wrong, and it would push a fresh
deployment back into a single-cloud override. Rewrite them: the base
URLs are optional deployment-wide overrides; normal dual-cloud operation
keeps them empty. Document the first-boot auto-relabel for deployments
migrating off the old single-cloud override, across the integration docs
(en/zh/ja/ko).
MUL-3083
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* fix(github): populate connected account name on install [MUL-3078]
The Settings → GitHub connection card was rendering 'Connected to
unknown' because:
1. fetchInstallationAccount in the setup callback hit GitHub's
/app/installations/{id} endpoint unauthenticated. That endpoint
requires App JWT auth; the call returned 401, and the function
fell through to the 'unknown' placeholder which was persisted as
account_login.
2. The installation webhook handler did upsert the row with the real
login when GitHub later delivered installation.created, but it
never published a github_installation:created event. The frontend
query stayed stale, so the UI kept showing 'unknown' even after
the row had been refreshed.
Fix:
- Add optional GITHUB_APP_ID + GITHUB_APP_PRIVATE_KEY env vars. When
set, signGitHubAppJWT mints a short-lived RS256 JWT (back-dated 60s
for clock skew, capped at 9m to stay inside GitHub's 10m max) and
fetchInstallationAccount uses it as a Bearer token. The setup
callback now writes the real org/user name on install.
- When the new env vars are not configured, the call still falls
through to 'unknown' as before — but the webhook handler now
publishes EventGitHubInstallationCreated after the upsert, so the
realtime listener invalidates the installations query and the UI
converges to the real value within seconds, no manual refresh.
Tests cover JWT signing (claims, signature, malformed PEM, partial
config), fetchInstallationAccount with a JWT-gated httptest mock,
and the webhook refresh + broadcast on a seeded 'unknown' row.
Docs updated for .env.example and github-integration /
environment-variables in en, zh, ja, ko.
Co-authored-by: multica-agent <github@multica.ai>
* test(github): defuse JWT clock-bomb by injecting parser time [MUL-3078]
PR review caught that TestSignGitHubAppJWT_ClaimsAndSignature signed the
token with a fixed 'now' (2026-06-05T12:00:00Z) but parsed it with a
default jwt.Parse, which uses real time.Now() for exp validation. Once
real wall clock crossed the token's exp (now + 9m = 12:09:00Z), the
test would have flipped to a deterministic failure on every CI run.
Inject the same fixed 'now' into the parser via jwt.WithTimeFunc so
both signing and validation share one clock. Verified independently
that without the fix the parser rejects the token as 'expired', and
with the fix it accepts.
Also clarified the fetchInstallationAccount comment to be unambiguous
about what 'do not block' actually means: the HTTP call IS synchronous
(no independent timeout, pre-existing wart), but a failure here just
falls back to the unknown placeholder rather than aborting the
callback.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
* fix: selfhost docker compose env does not accept LARK related env
* fix(selfhost): pass through MULTICA_LARK_CALLBACK_BASE_URL for international Lark
The inbound long-conn callback bootstrap reads MULTICA_LARK_CALLBACK_BASE_URL
(server/cmd/server/router.go buildLarkConnectorFactory ->
HTTPConnectionTokenFetcher), which defaults to open.feishu.cn with no
fallback to MULTICA_LARK_HTTP_BASE_URL. Without it forwarded into the
backend container, international Lark tenants can send (outbound HTTP via
MULTICA_LARK_HTTP_BASE_URL) but never receive messages — the bootstrap
still hits the mainland host.
Forward the var in docker-compose.selfhost.yml and document all three
Lark knobs in .env.example so operators can discover them from the
standard 'cp .env.example .env' onboarding path.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
Fix attachment download for self-hosted deployments using private S3-compatible buckets without CloudFront. Closes#3721.
**Server**
- New unified `GET /api/attachments/{id}/download` endpoint that picks CloudFront / S3 presign / server proxy at request time.
- `ATTACHMENT_DOWNLOAD_MODE=auto|cloudfront|presign|proxy` and `ATTACHMENT_DOWNLOAD_URL_TTL` env knobs; `auto` routes Docker hostnames / localhost / private IPs through the proxy and public S3 endpoints through presign.
- `Storage.PresignGet` capability; S3 implementation generates presigned GET URLs.
- `attachmentToResponse` returns the unified relative endpoint instead of leaking raw unsigned S3 URLs when CloudFront is not configured. Proxy path streams via `io.Copy` with `Content-Disposition` / `Content-Length` / `Cache-Control: no-store` / `X-Content-Type-Options: nosniff`.
**Clients**
- CLI / Desktop / Mobile resolve relative `download_url` values against the configured API base. Desktop covers the Electron native download bridge and the media preview modal; Mobile covers `Linking.openURL`, the markdown image RN loader, and the composer's completed non-image file chip.
- Mobile gains a minimal Node-environment vitest lane wired into `mobile-verify.yml`.
**Docs**
- `.env.example`, `docker-compose.selfhost.yml`, `SELF_HOSTING_ADVANCED.md`, and the `environment-variables` doc set updated with the new env keys and the `ATTACHMENT_DOWNLOAD_MODE=proxy` recommendation for Docker / VPC-internal object stores.
**Tests**
- `internal/storage`, `internal/cli`, `internal/handler` (download endpoint, mode selection, proxy header, `/content` non-regression), `cmd/server` (trusted proxy parser).
- `packages/views/editor/use-download-attachment.test.tsx` and `attachment-preview-modal.test.tsx` exercise relative URL resolution + absolute pass-through.
- `apps/mobile/lib/attachment-url.test.ts` covers every helper branch plus the composer non-image chip case.
* fix(email): wire SMTP_EHLO_NAME through self-host config + docs
Follow-up to #3679, which added SMTP_EHLO_NAME in code but never exposed
it to operators.
- docker-compose.selfhost.yml: pass SMTP_EHLO_NAME through to the backend
container. The compose env block is an explicit allowlist, so without
this the override set in .env was silently dropped and never reached
the process — making the escape hatch unusable on the docker path.
- Document the var alongside its SMTP_* siblings: .env.example,
SELF_HOSTING_ADVANCED.md, environment-variables.mdx, auth-setup.mdx,
and self-host-quickstart.mdx (the last two with a strict-relay example).
- email.go: log when os.Hostname() fails instead of silently falling back
to net/smtp's lazy "localhost" — the exact greeting strict relays reject.
- Add TestNewEmailService_EHLOName covering the env override, trimming,
and the hostname fallback.
MUL-2984
Co-authored-by: multica-agent <github@multica.ai>
* fix(email): gate EHLO resolution to SMTP mode + sync docs to zh/ja/ko
Addresses review nits on this PR:
- email.go: resolve smtpEHLOName only when SMTP_HOST is set, so the
Resend / DEV-stdout paths never call os.Hostname() or emit its
failure log. The EHLO name is only ever used on the SMTP send path.
- docs: add SMTP_EHLO_NAME to the zh/ja/ko variants of
environment-variables, self-host-quickstart, and auth-setup, in sync
with the English docs updated earlier in this PR.
Note: the ja/ko self-host-quickstart and auth-setup pages were already
missing the port-465 implicit-TLS example (pre-existing i18n drift from
an earlier SMTP_TLS change, unrelated to this PR); the new EHLO block is
inserted at the correct logical anchor regardless. A full ja/ko re-sync
is left as a separate follow-up.
MUL-2984
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* feat(email): support implicit TLS (SMTPS/465) for SMTP relay
The SMTP relay previously only did opportunistic STARTTLS: it dialed
plaintext and upgraded if the server advertised STARTTLS. Providers that
only offer implicit TLS on port 465 and do not advertise STARTTLS (e.g.
Aliyun enterprise mail) could not be used as a relay at all.
Add an SMTP_TLS env var:
- unset / starttls (default): unchanged STARTTLS-upgrade behavior.
- implicit / smtps / ssl: dial with tls.DialWithDialer (SMTPS).
Implicit TLS is auto-enabled when SMTP_PORT=465 and SMTP_TLS is unset, so
the common case works with no extra config. The startup log line now
reports the negotiated mode (starttls / implicit-tls).
Co-authored-by: multica-agent <github@multica.ai>
* feat(email): plumb SMTP_TLS through selfhost compose, warn on unknown values
The backend reads SMTP_TLS but docker-compose.selfhost.yml never forwarded
it, so SMTP_TLS=implicit on a non-standard port (or an explicit starttls
override on 465) silently did nothing inside the container. Add it to the
backend.environment block.
Also log a one-line warning when SMTP_TLS is set to an unrecognized value
(e.g. "tls"/"true"/"on"), which would otherwise fall through to STARTTLS
and fail to dial a 465 SMTPS port with no startup hint.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* test(email): cover SMTP_TLS precedence and alias resolution
Table-driven test over NewEmailService asserting the implicit-TLS decision:
465 auto-enables implicit; explicit starttls on 465 overrides auto-detect;
implicit/smtps/ssl aliases (case-insensitive, whitespace-trimmed) force SMTPS
on any port; unknown values fall back to starttls.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
* docs: document SMTPS / SMTP_TLS support, drop "465 unsupported"
Port 465 implicit TLS is now supported, so the five places that said it was
unsupported are wrong. Replace those sentences, add an SMTP_TLS row to the
environment-variables tables (EN + ZH), and add a copy-pasteable SMTPS env
block to the auth-setup pages.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: guofengchang <guofengchang@cumulon.com>
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Expose self-host daemon setup URLs from /api/config at runtime so the Add computer dialog renders the operator's own server/app domains, while Multica Cloud defaults stay unchanged.
Fixes#3013.
* feat(self-host): DISABLE_WORKSPACE_CREATION env var (MUL-2777, #3433)
When self-hosters set DISABLE_WORKSPACE_CREATION=true, POST /api/workspaces
returns 403 for every caller and the UI hides every "Create workspace"
affordance (sidebar, modal, /workspaces/new page, onboarding Step 2). This
closes the gap where ALLOW_SIGNUP=false still let any signed-in user open
an isolated workspace the platform admin couldn't see.
- server: new Config.DisableWorkspaceCreation, gate in CreateWorkspace,
workspace_creation_disabled in /api/config, Go tests.
- frontend: new workspaceCreationDisabled in configStore, hide sidebar
entry, swap NewWorkspacePage / CreateWorkspaceModal / onboarding
StepWorkspace to a "creation disabled, ask for invite" state when the
flag is on, EN + zh-Hans locale strings.
- ops: .env.example, docker-compose.selfhost, helm values + configmap,
SELF_HOSTING.md, SELF_HOSTING_ADVANCED.md, environment-variables docs
(EN + zh).
Co-authored-by: multica-agent <github@multica.ai>
* fix(onboarding): drive create path off workspaceCreationAllowed (#3433)
PR #3441 review: when DISABLE_WORKSPACE_CREATION=true and the user already
has a workspace, StepWorkspace still walked the resume copy (`headline_resume`
/ `lede_resume` mentioning "or start another") and `creatingActive` ignored
the flag, leaving a stale clickable create CTA possible if /api/config
arrived late.
Refactor StepWorkspace to derive a single `workspaceCreationAllowed`
boolean from the config store. It now drives:
- Initial `mode` state (defaults to "existing" when disabled + reusing so
the CTA is pre-armed for the only valid action).
- `creatingActive` so the footer CTA cannot fall back into the create
branch even mid-render.
- Eyebrow / headline / lede strings — adds
`creation_disabled_{eyebrow,headline,lede}_resume` (EN + zh-Hans) for
the disabled + reusing variant.
Tests: cover the three reachable shapes — flag off + no existing, flag on
+ no existing, flag on + existing.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
* feat(auth): make auth token TTL configurable via AUTH_TOKEN_TTL env var
Add AUTH_TOKEN_TTL environment variable (in seconds) to override the
hardcoded 30-day auth token lifetime. Self-hosted deployments on trusted
networks can set a longer value to avoid frequent magic-link
re-authentication.
The value is read once at startup and cached. Invalid or missing values
fall back to the 30-day default with a warning log.
Closes#2685
* refactor(auth): extract parseAuthTokenTTL for testability
Address review feedback: extract pure parse function from sync.Once
wrapper so the parsing logic can be unit-tested independently.
Add TestParseAuthTokenTTL with table-driven cases.
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
* refactor(auth): accept Go duration strings + hoist shared TTL in SetAuthCookies
Address nice-to-have review feedback from Bohan-J:
- parseAuthTokenTTL now tries time.ParseDuration first (e.g. '8760h'),
falling back to ParseInt for integer seconds
- Warn on unreasonable values (>10 years) but still accept them
- Hoist AuthTokenTTL() and time.Now() in SetAuthCookies so both
cookies share the exact same expiry
- Add security trade-off note in .env.example
- Add 5 new test cases for duration strings
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>
* fix: use AuthTokenTTL() in CloudFront middleware, guard ParseInt overflow
Address review feedback from Bohan-J (round 2):
1. CloudFront refresh middleware (cloudfront.go:21) was hardcoding
30*24*time.Hour instead of using auth.AuthTokenTTL(). Now calls
AuthTokenTTL() so the middleware respects AUTH_TOKEN_TTL env var.
2. parseAuthTokenTTL integer-seconds branch: very large values like
9999999999 would silently overflow int64 when multiplied by
time.Second. Added overflow guard comparing against
math.MaxInt64/int64(time.Second) before the multiplication.
3. Updated AuthTokenTTL() doc comment to reflect that it accepts
Go duration strings or integer seconds (not just seconds).
4. Added middleware test (cloudfront_test.go) verifying short
AUTH_TOKEN_TTL produces short cookie expiry, not 30-day hardcode.
Also covers nil signer and existing-cookie-skip cases.
5. Added integer overflow test case to cookie_test.go.
* style: run gofmt on cookie.go and cookie_test.go
---------
Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Adds REDIS_URL, RATE_LIMIT_AUTH, RATE_LIMIT_AUTH_VERIFY, and
RATE_LIMIT_TRUSTED_PROXIES to the environment-variables page (EN +
ZH) and to .env.example, with the reverse-proxy caveat that without
RATE_LIMIT_TRUSTED_PROXIES every user shares the proxy IP and the
whole deployment ends up in one bucket.
Follow-up to #2636. MUL-2251.
Co-authored-by: multica-agent <github@multica.ai>
* feat(server): add webhook trigger DB migration + sqlc queries
Lays the foundation for webhook autopilot triggers:
- partial unique index on autopilot_trigger.webhook_token (kind=webhook only)
so the public ingress route can resolve a trigger in O(1)
- GetWebhookTriggerByToken / TouchAutopilotTriggerFiredAt /
RotateAutopilotTriggerWebhookToken / SetAutopilotTriggerWebhookToken
queries, regenerated with sqlc
* feat(server): webhook token generator + payload normalizer
Two pure helpers for the webhook autopilot work:
- generateWebhookToken: 32 random bytes -> base64-url, "awt_" prefix.
256 bits of entropy keeps brute-force off the table; the prefix makes
leaked tokens recognisable in logs.
- normalizeWebhookPayload: turns arbitrary JSON into the WebhookEnvelope
shape (event/eventPayload/request) used by trigger_payload. Header- and
body-based event inference covers GitHub, GitLab, X-Event-Type, and
caller-provided envelopes; scalar/empty/invalid bodies are rejected so
the handler can answer 400.
* feat(server): generate webhook tokens and expose rotate endpoint
- New handler.Config.PublicURL fed by MULTICA_PUBLIC_URL env so
/api/autopilots/.../triggers responses can include an absolute
webhook_url alongside the always-present webhook_path.
- CreateAutopilotTrigger now mints a webhook_token via crypto/rand
for kind=webhook and ignores cron/timezone for non-schedule kinds.
api triggers stay accepted-but-inert per PLAN.md.
- New POST /api/autopilots/{id}/triggers/{triggerId}/rotate-webhook-token
protected by the existing workspace auth group; old tokens stop
working immediately because the unique-index lookup keys on the
current row value.
* feat(server): public webhook ingress route + per-token rate limiter
- New POST /api/webhooks/autopilots/{token} route, mounted outside the
authenticated group: the path token is the credential. Workspace
context is derived from the joined autopilot row, never headers.
- Body capped at 256 KiB via http.MaxBytesReader; oversized payloads
return 413 mid-read instead of being fully buffered.
- Disabled triggers / paused / archived autopilots return
200 {"status":"ignored"} so providers stop retrying.
- Skipped-runtime dispatches surface 200 {"status":"skipped"} with the
reason from the autopilot service's pre-flight admission check.
- WebhookRateLimiter interface with sliding-window in-memory + Redis
Lua-script implementations. Default 60 req/min per token. Test
coverage on the in-memory path; Redis variant fails open on cache
errors so a Redis hiccup never blocks ingress.
- Integration tests exercise token generation, dispatch, payload
envelope persistence, GitHub-header inference, paused/disabled
short-circuits, oversized rejection, and rotate-then-old-token-404.
* feat(server): include webhook payload in create_issue description
When an autopilot run is triggered by a webhook and execution_mode is
create_issue, the agent only sees the issue body — never the run's
trigger_payload. Append a 'Webhook event:' line and a fenced JSON block
with the normalized eventPayload so the agent has the inbound context
inline. Schedule / manual runs are unchanged.
Tests cover:
- schedule path keeps existing italic note, no webhook block
- webhook path emits event line + payload block, italic before block
- non-envelope JSON falls back to raw body (defensive)
- non-webhook source with payload still gets no webhook block
* feat(core): types, API client and mutations for webhook triggers
- AutopilotRunStatus gains 'skipped' so the run-list UI handles the
admission-skipped state explicitly instead of falling through to a
generic case (the backend already emits it via MUL-1899).
- AutopilotTrigger picks up optional webhook_path / webhook_url. Both
are optional so older self-hosted servers that pre-date this change
still parse cleanly.
- buildAutopilotWebhookUrl helper composes a usable absolute URL with
the priority webhook_url > apiBaseUrl + path > origin + path > path.
Tested with seven cases covering each branch.
- ApiClient.rotateAutopilotTriggerWebhookToken posts to
/api/autopilots/{id}/triggers/{triggerId}/rotate-webhook-token; the
HTTP-contract test pins URL + method.
- useRotateAutopilotTriggerWebhookToken mutation invalidates
autopilotKeys.detail on settle, mirroring the existing trigger-mutation
pattern.
* feat(views): webhook trigger UI in Add Trigger dialog and trigger row
Add Trigger dialog gains a Schedule/Webhook segmented toggle:
- Schedule reuses TriggerConfigSection unchanged.
- Webhook hides the cron config and shows a help line; the trigger is
created with kind=webhook and the URL is generated server-side.
- Toast text differentiates schedule vs webhook on success.
TriggerRow grows a webhook branch:
- Webhook icon, kind translated via trigger_kind.
- URL shown in a truncating monospace pill, with copy + rotate
buttons. Copy uses navigator.clipboard with toast feedback; rotate
uses an AlertDialog confirm because the old URL stops working
immediately.
- api triggers render a Deprecated badge and skip URL/copy/rotate
affordances.
RunRow gains a 'skipped' RUN_VISUAL entry (muted dash) so admission-
skipped runs don't fall through to a generic case. Source label uses the
new run_source i18n key instead of capitalize.
Locales: en + zh-Hans gain run_status.skipped, run_source.*,
trigger_kind.*, trigger_row.{copy_url,rotate_url,*_confirm_*,toast_*},
add_trigger_dialog.{type_*,webhook_help,toast_added_{schedule,webhook}}.
* feat(cli): support webhook trigger creation and URL rotation
- multica autopilot trigger-add now takes --kind schedule|webhook
(default schedule for backward compatibility). For webhook it skips
--cron / --timezone validation and prints the resulting webhook URL,
preferring the server-provided webhook_url and falling back to
client.BaseURL + webhook_path.
- New multica autopilot trigger-rotate-url <autopilot-id> <trigger-id>
command for rotating the bearer URL of a webhook trigger.
* docs(autopilots): add webhook trigger guide (en + zh)
Replaces the 'Webhook and API triggers are not available yet' section
with end-to-end webhook documentation: how the URL is generated, what
payload shapes are accepted, the inferred-event rules, the bearer-secret
warning + rotate flow, status-code semantics for accepted/skipped/
ignored/4xx/5xx outcomes, and the MULTICA_PUBLIC_URL self-host
configuration.
Run history list now mentions skipped status. The 'unavailable
features' section narrows to api-kind triggers, HMAC signing, IP
allowlists, and provider presets.
* feat(views): add Schedule/Webhook toggle to the create autopilot dialog
Closes the gap where a brand-new autopilot could only be created with a
schedule trigger. The right-column config now has a Trigger section
with a segmented Schedule/Webhook control:
- Schedule keeps the existing cron/timezone UI.
- Webhook hides the cron UI and shows a help line; on submit, a
kind=webhook trigger is created right after the autopilot.
In edit mode the toggle is intentionally hidden (PLAN.md treats trigger-
type changes as delete-old + create-new, not in-place updates), but the
panel still picks the right kind based on props.triggers[0].kind so a
webhook autopilot doesn't render an irrelevant cron form.
Locales: section_trigger_kind, trigger_kind_{schedule,webhook},
section_webhook, webhook_help_{create,edit} added in en + zh-Hans.
* feat(views): show webhook URL inline after creating a webhook autopilot
After a successful create with kind=webhook, the dialog stays open and
swaps to a confirmation panel showing the freshly minted URL with a
copy button + 'Treat this URL like a password' warning + Done button.
Avoids the friction of "create the autopilot, then go find it in the
list, click in, scroll to triggers, copy URL."
Locales: dialog.webhook_created_{title,description,warning,done} added
in en + zh-Hans.
Schedule create flow is unchanged (toast + close). The success panel is
gated on the trigger returned from the create mutation, so a partial
failure (autopilot created, trigger creation errored) still falls
through to the toast_create_partial path.
* feat(views): show webhook payload in run detail dialog
The agent transcript dialog now accepts an optional headerSlot that
sits above the event list. The autopilot RunRow drops a
WebhookPayloadPreview into that slot when the run came from a webhook
and trigger_payload is non-empty.
The preview is collapsed by default (the transcript itself is the main
event), shows the inferred event name + receivedAt in the header, and
reveals the eventPayload as pretty-printed JSON with a copy button on
expand. Falls back gracefully if the row's trigger_payload doesn't
match the WebhookEnvelope shape — the whole value is shown instead so
nothing is hidden.
Closes the "agent didn't echo the payload, now I can't see what
triggered the run" gap. PLAN.md tracked this as
"Payload preview in run history" under follow-ups.
Locales: webhook_payload.{label, unknown_event, payload, content_type,
copy, copied, copied_short, copy_failed} added in en + zh-Hans.
* chore(server): wire MULTICA_PUBLIC_URL through self-host compose
Two small follow-ups split out of the webhook trigger PR:
- docker-compose.selfhost.yml passes MULTICA_PUBLIC_URL into the
backend container so a self-hosted deployment behind a real domain
gets absolute webhook URLs in the trigger response. Documented in
.env.example with the rationale for not deriving the public host
from request headers.
- Drop a duplicated 'invalid json:' prefix in the webhook ingress
400 error path. normalizeWebhookPayload already prefixes its
errors, so the handler doesn't need to re-prefix.
* fix(migrations): renumber webhook trigger migration 081 → 089 to avoid collision
The branch's 081_autopilot_webhook_triggers.{up,down}.sql collided
numerically with 081_runtime_timezone.{up,down}.sql that landed on
main, making migration apply order undefined. Renumber to 089 so the
file slots after the latest main migration (088_squad_instructions).
The SQL itself doesn't conflict — it only creates a partial unique
index on autopilot_trigger.webhook_token — but the duplicate prefix
is what the migration runner sees, so the filename must move.
* fix(autopilot-webhook): address PR review blocking issues
- Redact bearer tokens from request logs: paths matching
/api/webhooks/autopilots/<token> now log "[redacted]" instead of the
token. The resolved trigger ID is plumbed via context so audit lines
stay useful for debugging. (Review item Blocking #1.)
- Distinguish pgx.ErrNoRows from transient DB errors in token lookup:
no-row stays 404 (so providers don't retry on a deleted webhook),
other errors return 500 (which providers DO retry, avoiding silent
drops on DB blips). (Review item Blocking #2.)
- Add per-IP sliding-window rate limiter that runs BEFORE the token
lookup, so spraying random tokens can no longer probe the
autopilot_trigger index unboundedly. Reuses the existing Lua script
with a separate Redis key namespace; falls open on Redis errors.
Default budget 30 req/min/IP. (Review item Blocking #3.)
The webhook handler now applies the gates in the order: per-IP rate
limit → token lookup → per-token rate limit → handler logic.
* fix(autopilot): atomic webhook trigger creation + strict kind/timezone validation
- Mint the webhook bearer token BEFORE the INSERT and pass it via
CreateAutopilotTriggerParams so the row never exists in a half-written
kind=webhook + webhook_token=NULL state. On the (vanishingly rare)
unique-index collision the whole INSERT is retried with a fresh token
— no UPDATE second step. Removes the now-dead attachFreshWebhookToken
helper. (Review item Recommended #4.)
- Add new GET /api/autopilots/{id}/runs/{runId} endpoint that returns a
single run including the full trigger_payload. The list response is
now slim (omits trigger_payload) so worst-case payload size drops
from ~5 MB to ~5 KB. (Review item Recommended #5, server side.)
- Reject kind=api with 400 ("kind=api is deprecated; use schedule or
webhook") and reject kind=webhook with --timezone with 400 — both
surfaces stragglers loudly instead of silently dropping fields.
CLI mirrors the check so --timezone with --kind webhook errors
client-side. (Review nits.)
- Add --yes (-y) flag and an interactive y/N confirmation prompt to
`multica autopilot trigger-rotate-url` so the destructive rotate
matches the UI's AlertDialog safety. (Review item Recommended #6.)
* fix(views): fetch webhook payload on-demand and truncate at 4 KiB
- Add useAutopilotRun query hook + getAutopilotRun API client method
paired with the new server endpoint. The run-detail dialog now mounts
a WebhookPayloadSlot that fetches the full run (incl. trigger_payload)
lazily — list responses no longer carry up to 256 KiB × N runs of
envelope data.
- WebhookPayloadPreview truncates its in-DOM <pre> at 4 KiB with a
localized marker so jank-y machines aren't asked to render a 256 KiB
JSON blob. The Copy button still yields the full string.
- Adds the truncated_marker i18n string to en + zh-Hans.
Review items Recommended #5 (frontend) and a nit on the preview's
unbounded <pre>.
* test(autopilot-webhook): close coverage gaps flagged in PR review
- request_logger: redactWebhookPath unit tests + integration test
proving the bearer token never lands in slog output, plus the
webhook_trigger_id context plumbing.
- autopilot_webhook_handler: empty body → 400, archived autopilot →
200 ignored, per-IP rate limiter trips before DB lookup, kind=api
and webhook+timezone are rejected at 400, slim list + full detail
endpoint round-trip.
- webhook_rate_limiter: Lua script structure guard (catches reordering
even without a live Redis), plus live-Redis tests for both per-token
and per-IP limiters (REDIS_TEST_URL gated, matching the existing
Redis test pattern in the package).
- WebhookPayloadPreview: envelope rendering, fallback shape, and the
>4 KiB truncation path with full-payload-on-Copy guarantee.
Two branches are documented as code-review-protected rather than
covered by tests: the 500-on-DB-error path requires injecting a stub
Queries (no interface here), and the cross-workspace defense-in-depth
check is unreachable from valid SQL state.
* fix(middleware): SetWebhookTriggerID must mutate request in place
The round-1 helper returned a fresh *http.Request from WithContext, and
the webhook handler did `r = SetWebhookTriggerID(r, ...)`. That swaps
the handler's local pointer but doesn't propagate the new context back
to RequestLogger, which is still holding the original *http.Request —
so the audit line never actually included webhook_trigger_id in
production. The round-1 test happened to pass because it pre-stashed
the value on the request before calling ServeHTTP, bypassing the bug
it was meant to verify.
Switch to in-place mutation via `*r = *r.WithContext(...)` so the
wrapping middleware sees the new context after next.ServeHTTP returns,
and update the test to exercise the real call pattern (set the context
from inside the handler, assert the surrounding logger reads it).
Verified live: an accepted webhook now logs
path=/api/webhooks/autopilots/[redacted] webhook_trigger_id=<uuid>
* fix(autopilot-webhook): symmetric ErrNoRows split + trusted-proxy gate
Round-2 review (Bohan-J, PR #2348 follow-up):
- Must-fix #1: the second lookup at autopilot_webhook.go:258
(GetAutopilot after the token resolves) was folding every error into
404. A transient DB blip would tell a webhook sender "not found" and
it would never retry. Apply the same errors.Is(err, pgx.ErrNoRows)
→ 404 / else → 500 split as the first lookup got in round 1.
- Must-fix #2: clientIPForRateLimit was honoring X-Forwarded-For /
X-Real-IP from any caller. An attacker spraying random tokens could
just rotate the XFF header and the per-IP bucket became per-request,
so the limiter that's specifically supposed to gate spraying before
it hits the DB unique index was bypassed.
New shape — matches Bohan's suggestion exactly:
* Default: r.RemoteAddr only, headers ignored.
* Operator opt-in via MULTICA_TRUSTED_PROXIES (comma-separated
CIDRs). XFF/X-Real-IP are honored only when r.RemoteAddr is
inside one of the listed prefixes; otherwise they're dropped.
Wired through .env.example and docker-compose.selfhost.yml so
self-host operators can configure their reverse-proxy's CIDR.
Invalid CIDRs in the env var are dropped with a single slog.Warn at
startup rather than crashing the server. Uses net/netip (stdlib,
value-typed) for parsing and containment checks.
Verified live on the rebuilt self-host backend: a 35-request spray
from one source with rotating XFF gets the expected 30× 404 + 5× 429,
proving the per-IP bucket is keyed on the real connection IP.
* fix(autopilot): reject cron/timezone PATCH on non-schedule triggers
Round-2 review should-fix. CreateAutopilotTrigger already 400s on
kind=webhook + timezone/cron_expression, but UpdateAutopilotTrigger
silently wrote those fields regardless of prev.Kind. The values then
sat in the DB visible to nobody and read by nothing — a back door that
left the API contract fuzzy across create vs update.
Mirror the create-path discipline: after loading prev, if prev.Kind
!= "schedule" and the PATCH body sets cron_expression or timezone,
return 400 with a clear message. enabled and label remain accepted on
every kind.
The existing prev.Kind == "schedule" guard on next_run_at recompute
stays as belt-and-braces, but with this gate in place the recompute
branch is now reachable only for the kind it was meant for.
* test(autopilot-webhook): close round-2 coverage gaps
- IPRateLimitNotBypassedByXFFSpoof: drives the must-fix #2 invariant
by rotating XFF across three calls from the same RemoteAddr and
asserting the third gets 429. Pre-round-2 this test would have
passed for the wrong reason (limiter trusted XFF, so per-bucket
collision was incidental); now it pins the bypass-closed property.
- IPRateLimitReturns429BeforeDBLookup: updated to set RemoteAddr
explicitly and drop the XFF header it was leaning on. With
TrustedProxies empty (test default) the limiter keys on the real
connection IP, which is what the test wants to assert anyway.
- UpdateAutopilotTrigger_RejectsCronExpressionOnWebhookKind +
UpdateAutopilotTrigger_RejectsTimezoneOnWebhookKind: drive the
round-2 should-fix from the handler boundary.
- UpdateAutopilotTrigger_AcceptsEnabledAndLabelOnWebhookKind: counter
test so a regression to a blanket reject is caught.
* fix(migrations): bump webhook trigger migration 089 → 091
origin/main added 089_squad_no_action_activity_index (and 090_task_is_leader)
since our last rebase, re-colliding with our 089_autopilot_webhook_triggers.
Bump to 091 so the filename ordering is unambiguous again. The SQL is
unchanged — same partial unique index on autopilot_trigger.webhook_token —
only the filename moves.
* fix(views): dedupe skipped icon in autopilot RUN_VISUAL after rebase
The rebase against origin/main merged main's add of `Ban` for the
skipped status next to our round-1 `MinusCircle` entry, leaving the
RUN_VISUAL map with two `skipped` keys (only the last would have been
read at runtime, and MinusCircle had been dropped from the imports
during conflict resolution — so the file would not compile).
Keep main's `Ban` icon (latest design) and a single `skipped` entry.
Carry over the round-1 comment about why the muted styling matters
for failure-ratio readability.
---------
Co-authored-by: Kerim Incedayi <kerim.incedayi@digitalchargingsolutions.com>
* docs(email): clarify 888888 is opt-in via MULTICA_DEV_VERIFICATION_CODE; document SMTP option in self-host docs
The startup log line, .env.example, and SELF_HOSTING_ADVANCED.md still
implied that the dev master code 888888 is auto-active whenever
APP_ENV != "production". That has not been true since the master code
was gated behind MULTICA_DEV_VERIFICATION_CODE — the fixed code is
disabled by default and must be opted in explicitly.
Also extend the docs site with the SMTP relay backend added in #1877:
auth-setup, environment-variables, and self-host-quickstart now cover
both Resend and SMTP options in EN and ZH.
Co-authored-by: multica-agent <github@multica.ai>
* docs(email): treat SMTP as an email backend in self-host docs and startup warning
Address review feedback on #2666:
- server: startup warning now fires only when both RESEND_API_KEY and SMTP_HOST
are empty, since either one is a valid email backend. Otherwise the log
mis-tells SMTP-only operators that verification codes go to stdout.
- self-host-quickstart (EN/ZH): tell readers to fetch the verification code
from whichever backend they configured (Resend or SMTP); fall back to
stdout only when neither is configured.
- auth-setup (EN/ZH): \"without Resend\" → \"without any email backend
configured\" so the wording stays correct now that SMTP is a first-class
option.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
* feat(email): add SMTP relay as alternative to Resend
Self-hosted deployments often run behind a corporate firewall with an
existing SMTP relay (Exchange, Postfix, sendmail) and no access to
external SaaS APIs. Resend requires a public domain, an API key, and
outbound HTTPS to api.resend.com — all unavailable in air-gapped or
private-network setups.
This adds a second email delivery path using Go's stdlib net/smtp,
activated when SMTP_HOST is set. Priority order:
1. SMTP relay (SMTP_HOST set)
2. Resend API (RESEND_API_KEY set)
3. DEV stdout (neither set)
New env vars (all optional, no breaking change):
SMTP_HOST — SMTP server hostname
SMTP_PORT — port, default 25
SMTP_USERNAME — for authenticated SMTP; empty = unauthenticated relay
SMTP_PASSWORD — used only when SMTP_USERNAME is set
SMTP_TLS_INSECURE — set to "true" to skip TLS cert verification
(for private CA / self-signed certs)
The implementation:
- Dials TCP, creates smtp.Client manually (avoids smtp.SendMail which
does not expose TLS config)
- Tries STARTTLS if advertised; uses InsecureSkipVerify only when
SMTP_TLS_INSECURE=true (opt-in, nolint:gosec annotated)
- Applies PlainAuth only when SMTP_USERNAME is non-empty
- Wraps all errors with context for easier debugging
- Reuses existing HTML templates from buildInvitationParams for
invitation emails (no template duplication)
Also updates .env.example and docker-compose.selfhost.yml with the
new variables and inline documentation.
* fix(email): add dial timeout, session deadline, RFC headers for SMTP path
Address review blockers from multica-eve and Bohan-J (PR #1877):
- net.Dial → net.DialTimeout(10s) + conn.SetDeadline(30s) so a blackholed
SMTP relay cannot hang SendVerificationCode (called synchronously from the
auth handler) or leak goroutines in the invitation path.
- Add Date, Message-ID, and proper Content-Transfer-Encoding headers.
Date is required by RFC 5322; many strict relays reject messages without it.
Message-ID aids deliverability and threading.
- MIME-encode Subject via mime.QEncoding so non-ASCII workspace/inviter names
(CJK, emoji) survive without corruption across any RFC 2047-conformant relay.
- Probe 8BITMIME after (possible) STARTTLS: use Content-Transfer-Encoding 8bit
when the relay advertises 8BITMIME, quoted-printable otherwise — safe for
all relay configurations without forcing base64 overhead.
- Update SELF_HOSTING_ADVANCED.md to document Option B (SMTP relay) alongside
the existing Resend section, including all five env vars and a note that
port 465/SMTPS is not yet supported.
* fix(email): correct has8Bit assignment order (bool is first return of Extension)
The GitHub App integration code reads these two env vars and only enables
the Connect flow when both are set. .env.example never listed them, and
docker-compose.selfhost.yml did not forward them into the backend
container, so self-hosters following the integration docs had no working
way to turn the feature on.
MUL-2107
Co-authored-by: multica-agent <github@multica.ai>
* fix(storage): build region-qualified S3 public URLs (#2051)
The uploadedURL fallback (no CloudFront, no custom endpoint) wrote
"https://<bucket>/<key>" — missing the ".s3.<region>.amazonaws.com"
suffix — so any deployment that pointed S3_BUCKET at a real AWS bucket
without a CDN got broken image URLs back to the client. Avatar URLs
were persisted in this broken form on the user/agent rows, so profile
pictures uploaded via the SDK never rendered.
- Track S3_REGION on S3Storage and emit
https://<bucket>.s3.<region>.amazonaws.com/<key> by default;
fall back to path-style https://s3.<region>.amazonaws.com/<bucket>/<key>
when the bucket name contains dots, since the AWS wildcard cert
can't validate dotted virtual-hosted hosts.
- Teach KeyFromURL to recognise the new region-qualified hosts (both
styles) and keep recognising the legacy bucket-only host so historical
records can still be deleted/migrated.
- Document that S3_BUCKET is the bucket name only, not a hostname,
in env-vars docs (en+zh), self-hosting guides, and .env.example.
Co-authored-by: multica-agent <github@multica.ai>
* feat(storage): warn at startup when S3_BUCKET looks like a hostname
Catches the most common misconfiguration shape (S3_BUCKET set to
"<bucket>.s3.<region>.amazonaws.com") with a startup log line so
operators don't silently end up with a config that signs uploads
against an invalid bucket name.
A real bucket name can never legitimately contain "amazonaws.com",
so the check is a single substring match — no false positives
worth carving out.
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
* Restrict /health/realtime metrics exposure (MUL-1342)
The realtime metrics endpoint was registered on the public router with
no authentication, exposing per-event/per-scope counters, redis.last_error,
and redis.node_id to anonymous callers. This enables information disclosure
and traffic profiling.
Move the handler behind a token + loopback policy:
- If REALTIME_METRICS_TOKEN is set, require Authorization: Bearer <token>
using a constant-time compare. Reject other callers with 401 plus a
WWW-Authenticate hint.
- If the env var is unset, only serve loopback callers and return 404 to
remote clients so the endpoint is not enumerable. This keeps local dev
workflows working without configuration.
The handler is extracted into health_realtime.go with focused unit tests
covering the token, loopback, and rejection paths. .env.example documents
the new variable.
Refs: https://github.com/multica-ai/multica/issues/1606
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fail closed for proxied /health/realtime requests (MUL-1342)
Addresses review on PR #1608: when the server runs behind a reverse
proxy (Caddy / Nginx -> localhost:8080), public callers reach the Go
handler with RemoteAddr=127.0.0.1, so the previous loopback shortcut
exposed the metrics surface in self-hosted deployments.
The no-token path now treats any forwarding header
(X-Forwarded-For / -Host / -Proto, X-Real-Ip, Forwarded) as a
'this request was proxied, can't attribute, fail closed' signal and
returns 404. Direct loopback callers without those headers still work
for local dev. Token-gated path is unchanged.
Tests cover all listed proxy headers (incl. multi-hop XFF chain and
RFC 7239 Forwarded) over both 127.0.0.1 and ::1, plus a regression
case ensuring an empty/whitespace forwarding header does not break
direct loopback access. .env.example updated to call out that proxied
deployments must configure REALTIME_METRICS_TOKEN.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: CC-Girl <cc-girl@multica.ai>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Publish stable GHCR self-host images, switch self-host deploys to official image pulls with a source-build fallback, and move self-host signup / Google OAuth config onto runtime /api/config.
* feat(analytics): add PostHog client with async batch shipping
Introduces server/internal/analytics, the shipping layer for the product
funnel defined in docs/analytics.md. Capture is non-blocking — events are
enqueued into a bounded channel and a background worker batches them to
PostHog's /batch/ endpoint. A broken backend drops events rather than
blocking request handlers.
Local dev and self-hosted instances run a noop client until the operator
sets POSTHOG_API_KEY. This is PR 1 of MUL-1122; signup and workspace_created
emission land in the follow-up commit so this change is independently
reviewable.
* feat(server): emit signup and workspace_created analytics events
Wires analytics.Client through handler.New and main, then emits the first
two funnel events:
- signup fires from findOrCreateUser (which now reports isNew), covering
both the verification-code and Google OAuth entry points — a single
emission site guarantees Google signups aren't missed.
- workspace_created fires after the CreateWorkspace transaction commits,
with is_first_workspace computed from a post-commit ListWorkspaces count
so we can distinguish fresh-user activation from returning-user
expansion.
Tests use analytics.NoopClient so nothing ships from test runs. PR 1 of
MUL-1122; runtime_registered and issue_executed follow in later PRs per
the plan.
* refactor(analytics): drop is_first_workspace from workspace_created
Stamping "is this the user's first workspace?" at emit time races under
concurrent CreateWorkspace requests: two transactions committing close
together can both read a post-commit count greater than one and both emit
false. Fixing it at the SQL layer requires a schema change we don't want in
PR 1.
PostHog answers the same question exactly from the event stream (funnel on
"first time user does X" / cohort on $initial_event), so removing the
property loses no information and makes the emit side race-free.
* docs(analytics): document self-host safety defaults
Spell out why self-hosted instances never ship events upstream by default
(empty POSTHOG_API_KEY → noop client) and explain how operators can point
at their own PostHog project without any code change.
* feat(analytics): emit runtime_registered, issue_executed, team_invite_*
Three server-side funnel events, all gated on first-time state transitions
so retries and re-runs don't inflate the WAW buckets:
- runtime_registered fires from DaemonRegister when UpsertAgentRuntime
reports (xmax = 0) — i.e. the row was inserted, not updated. Heartbeats
and re-registrations stay silent.
- issue_executed fires from CompleteTask after an atomic
UPDATE issue SET first_executed_at = now() WHERE id = $1 AND
first_executed_at IS NULL flips the column for the first time. Retries,
re-assignments, and comment-triggered follow-up tasks hit the WHERE
clause and no-op. Carries nth_issue_for_workspace so the ≥1/≥2/≥5/≥10
buckets filter without extra queries.
- team_invite_sent fires from CreateInvitation and team_invite_accepted
from AcceptInvitation, closing the expansion funnel.
Adds a 050 migration for issue.first_executed_at plus a partial index so
the workspace-scoped executed-count query doesn't scan the never-executed
tail.
* feat(config): surface PostHog key via /api/config
Extends AppConfig with posthog_key / posthog_host sourced from env on
every request (so operators can rotate the key via secret refresh without
a restart). Reading the key off the server — rather than baking it into
the frontend bundle via NEXT_PUBLIC_* — means self-hosted instances
inherit the blank key automatically and never ship events upstream.
* feat(analytics): wire posthog-js identify + UTM capture on the client
Adds @multica/core/analytics — a thin wrapper around posthog-js that owns
attribution capture and identity merge. Posthog-js config comes from
/api/config (not NEXT_PUBLIC_*), so self-hosted instances whose server
returns an empty key automatically run the SDK inert.
captureSignupSource stamps a multica_signup_source cookie with UTM params
and the referrer's origin (never the full referrer — that can leak OAuth
code/state in the callback URL). The backend signup event reads this
cookie on new-user creation.
Identity flows:
- auth-initializer fires identify() right after getMe() resolves, on both
cookie and token paths. A getConfig/getMe race is handled by buffering
a pending identify inside the analytics module and flushing it once
initAnalytics finishes.
- auth store calls identify() on verifyCode / loginWithGoogle /
loginWithToken and resetAnalytics() on logout so the next login merges
cleanly without bleeding events.
* docs(analytics): describe runtime_registered, issue_executed, invite events
Fills in the schema for the remaining funnel events. Captures the
design commentary that belongs next to the contract rather than in a PR
description — in particular why issue_executed uses the atomic
first_executed_at flip instead of counting task-terminal events, and why
runtime_registered relies on xmax = 0 rather than a query-then-write.
* fix(analytics): drop non-atomic nth_issue_for_workspace from issue_executed
Computing the workspace's Nth-issue ordinal at emit time is not atomic
under concurrent first-completions — two transactions can both run
MarkIssueFirstExecuted, then both run CountExecutedIssuesInWorkspace, and
both observe count=1 before either has committed, so both events go out
stamped as n=1. Serialising it would mean a per-workspace advisory lock
or a SERIALIZABLE-isolated tx; PostHog answers the same question exactly
at query time via row_number() partitioned by workspace_id, so the
emit-time property adds risk without adding information.
Removes the property from analytics.IssueExecuted, deletes the unused
CountExecutedIssuesInWorkspace query, and regenerates sqlc. The partial
index stays — any future workspace-scoped executed-issue query will want
it.
* fix(analytics): wire $pageview and harden signup_source cookie payload
Two frontend fixes from the PR review:
- PageviewTracker, mounted under WebProviders, fires capturePageview on
every Next.js App Router path / query-string change. Without this the
capturePageview helper in @multica/core/analytics was never called and
the acquisition funnel's / → signup step was empty.
- captureSignupSource now caps each UTM / referrer value at 96 chars
*before* JSON.stringify, and drops the whole cookie when the serialised
payload still exceeds 512 chars. Previously the overall slice(0, 256)
could leave a half-JSON string on the wire that neither the backend nor
PostHog could parse.
Both capturePageview and identify now buffer a single pending call when
fired before initAnalytics resolves — otherwise the initial "/" pageview
and same-turn login identify race the /api/config fetch and get dropped.
resetAnalytics clears both buffers so a logout→login cycle stays clean.
* fix(analytics): URL-decode signup_source cookie on read
Go does not URL-decode Cookie.Value automatically, so the frontend's
JSON-then-encodeURIComponent payload was landing in PostHog as
percent-encoded garbage (%7B%22utm_source...). Unescape on read so the
backend receives the original JSON string the frontend intended, and
drop values that fail to decode or exceed the server-side cap — sending
truncated garbage is worse than sending nothing. Oversized-cookie guard
matches the frontend's SIGNUP_SOURCE_MAX_LEN.
* docs(analytics): reflect nth-issue drop, $pageview wiring, cookie encoding
Pulls the schema doc back in line with the code: issue_executed no longer
advertises nth_issue_for_workspace (with a note about why PostHog derives
it at query time instead), the frontend $pageview section names the
actual PageviewTracker component that fires it, and the signup_source
section documents the per-value cap / overall drop rule and the
encode-on-write / decode-on-read contract.
---------
Co-authored-by: Jiang Bohan <bhjiang@outlook.com>
The session cookie's Secure flag was tied to APP_ENV, and the
docker-compose self-host stack defaults APP_ENV to "production". On
plain-HTTP self-host deployments (LAN IP, private network) the browser
silently drops Secure cookies, leaving every subsequent /api/* call
anonymous and surfacing as 401 "auth: no token found" right after a
successful login.
Derive Secure from the scheme of FRONTEND_ORIGIN so HTTPS origins get
Secure cookies and plain-HTTP origins get non-secure cookies the
browser will actually store. Also harden cookieDomain() against the
other common trap: COOKIE_DOMAIN=<ip>, which RFC 6265 forbids and
browsers reject. Log a one-shot warning and fall back to host-only.
Docs: correct the COOKIE_DOMAIN description (it was labelled as
CloudFront-only but applies to session cookies too) and call out the
IP-literal pitfall in SELF_HOSTING_ADVANCED.md, self-hosting.mdx, and
.env.example.
Refs #1321
* feat(server): configurable pgxpool size with sane defaults
pgxpool.New(ctx, url) silently sets MaxConns = max(4, NumCPU). On the
prod pods that resolved to 4, which got fully saturated by daemon
claim/heartbeat traffic (~3800 acquires/s) and showed up as ~900ms
acquire waits on every query — the actual root cause of the 3s+
/tasks/claim tail latency. The db pool stats logging from #1378
confirmed this with empty_acquire_delta == acquire_count_delta.
Switch to pgxpool.ParseConfig + NewWithConfig and apply per-pod
defaults of MaxConns=25 / MinConns=5, both overridable via env vars
(DATABASE_MAX_CONNS / DATABASE_MIN_CONNS) so the size can be tuned
in prod without a redeploy.
The defaults follow the standard 'small pool, lots of waiters' guidance
for Postgres (PG community / HikariCP formula
`(core_count * 2) + effective_spindle_count`); 25 leaves headroom for
bursts and occasional long queries while staying safely under typical
managed-Postgres max_connections ceilings when multiplied across pods.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(server): respect DATABASE_URL pool_* params; add precedence tests
Address review feedback on #1381:
- Configuration precedence is now explicit: DATABASE_MAX_CONNS env >
pool_max_conns query param on DATABASE_URL > built-in default. Same
for min_conns. Previously the env-empty path unconditionally
overwrote whatever ParseConfig had read from the URL — a silent
regression for deployments that already tuned pool size via the
connection string.
- Add unit tests in dbstats_test.go covering each precedence branch
(defaults, URL-only, env-over-URL, partial URL, invalid env,
min>max clamp).
- Move pool tuning vars out of 'Required Variables' into a new
'Database Pool Tuning (Optional)' section in SELF_HOSTING_ADVANCED.md
so self-hosters don't think they need to set them.
- Add commented entries in .env.example.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(server): invalid pool env falls back to URL/code default, never pgx 4
Address second round of review on #1381:
Previous code passed cfg.MaxConns / cfg.MinConns as the envInt32 fallback,
which meant an invalid DATABASE_MAX_CONNS value silently fell back to
ParseConfig's value — i.e. pgx's built-in default of 4/0 when the URL had
no pool_* params. That's exactly the bad value this PR exists to remove,
and the previous test (TestPoolSizing_InvalidEnvFallsBack) accidentally
locked it in.
Compute the non-env fallback first (URL pool_* if present, else code
default 25/5) and pass that to envInt32. Misconfigured env now lands on
the same value as if the env were unset — never on the pgx default.
Replace the loose 'max > 0' assertion with two precise tests:
- invalid env + no URL param → code default (25/5)
- invalid env + URL param → URL value
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The v0.2.6 self-host security fix (#1307) defaults APP_ENV to "production"
in docker-compose.selfhost.yml, which disables the 888888 master verification
code. The follow-up docs PR (#1313) covered SELF_HOSTING.md and the
installers, but `.env.example` — the file users actually copy and edit —
still makes no mention of APP_ENV, so operators who don't read the prose
docs hit the exact same "888888 stopped working after upgrade" confusion
reported in #1331.
- Add APP_ENV= to .env.example with a comment block that spells out the three
cases (Docker default, local dev, evaluation) and warns against enabling
the bypass on public instances. Keeping the value empty preserves the
current `make dev` UX (Go server reads empty → treats as non-production →
888888 works locally) while `${APP_ENV:-production}` in the compose file
still ensures public Docker deployments are safe by default.
- Update the existing 888888 mention under # Email so it no longer
contradicts the new gating rule.
- Update the `make selfhost` post-start banner, which still told operators
to "Log in with any email + verification code: 888888" even after #1307
disabled that path by default.
Closes#930
- Added environment variables to control signups
- Updated frontend to hide signup text when disabled
- Added backend check to block new user creation via magic link
- Updated .env.example
The .env.example had hardcoded http://localhost:8080 defaults for
NEXT_PUBLIC_API_URL and NEXT_PUBLIC_WS_URL. When users copied .env.example
to .env and customized the backend port, the old defaults would still get
baked into the frontend at docker build time via NEXT_PUBLIC_WS_URL build
arg, causing API/WebSocket connection failures.
With empty defaults:
- Docker selfhost: frontend uses relative paths, Next.js rewrites proxy
to backend internally — works regardless of external port config
- Local dev (make dev): Makefile sets these to localhost:$PORT automatically
- Browser fallback: deriveWsUrl() auto-derives WebSocket URL from page
origin when NEXT_PUBLIC_WS_URL is empty
Closes#1055
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(auth): migrate auth token to HttpOnly cookie & implement WebSocket Origin whitelist
Security improvements from the MUL-566 audit report:
1. Auth token is now set as an HttpOnly, SameSite=Lax cookie on login,
preventing XSS-based token theft. Cookie-based auth includes CSRF
protection via double-submit cookie pattern. The Authorization header
path is preserved for Electron desktop app and CLI/PAT clients.
2. WebSocket upgrader now validates the Origin header against a
configurable allowlist (ALLOWED_ORIGINS env var), rejecting
connections from unauthorized origins.
Backend: new auth cookie helpers, middleware reads cookie as fallback,
WS handler accepts cookie auth, Origin whitelist, logout endpoint.
Frontend: CSRF token in API headers, cookie-aware auth store and WS
client, web app opts into cookieAuth mode while desktop keeps tokens.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(auth): address PR review — Strict cookies, HMAC-bound CSRF, origin sync
1. SameSite=Lax → SameSite=Strict per spec requirement
2. CSRF token now HMAC-signed with auth token (nonce.signature format),
preventing subdomain cookie injection attacks
3. allowedWSOrigins uses atomic.Value to eliminate data race
4. Removed magic "cookie" sentinel string in WSProvider — pass null token
and guard with boolean check instead
5. Removed dead delete uploadHeaders["Content-Type"] in API client
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(storage): add local file storage fallback
- Add local storage implementation for file uploads
- Update .env.example with LOCAL_UPLOAD_DIR and LOCAL_UPLOAD_BASE_URL
- Integrate local storage into server router and handlers
- Add storage abstraction layer with util functions
* ♻️ refactor(storage): improve path handling and file serving
switch from path to filepath for better cross-platform support and replace manual file serving logic with http.ServeFile to enhance security against path traversal. update unit tests to use t.Setenv for cleaner environment variable management.
* fix(auth): log email send errors and gracefully degrade in non-production
In non-production environments (APP_ENV != "production"), if sending the
verification code email fails, log the error as a warning and still return
success. This lets self-hosting users log in with the master code (888888)
even when their Resend configuration is incomplete (e.g. unverified from-domain).
In production, the behavior is unchanged — email failures return 500.
Also adds guidance in .env.example about RESEND_FROM_EMAIL for self-hosters.
Closes#723
* fix(auth): remove APP_ENV degradation, keep error logging only
Remove the APP_ENV-based graceful degradation for email send failures
— it's risky if users forget to set APP_ENV=production. Instead, always
return 500 on email failure (safe for production) and rely on the error
log (slog.Error) with the actual Resend error for debugging.
Self-hosters who don't need real emails should leave RESEND_API_KEY empty
(codes print to stdout, master code 888888 works).
Support Google login that links to existing accounts by email.
When a user who registered via email OTP signs in with Google using
the same email, they are linked to the same account.
Backend:
- Add POST /auth/google endpoint that exchanges Google auth code for
tokens, fetches user profile, and calls findOrCreateUser()
- Updates user name and avatar from Google profile on first Google login
Frontend:
- Add "Continue with Google" button on login page (shown when
NEXT_PUBLIC_GOOGLE_CLIENT_ID is configured)
- Add /auth/callback page to handle Google OAuth redirect
- Add loginWithGoogle to auth store and API client
- Load root .env in next.config.ts so REMOTE_API_URL is available
- Default fallback remains localhost:8080 (no impact on existing setups)
- Add REMOTE_API_URL to .env.example with documentation
Add POST /api/upload-file endpoint that uploads files to S3 and returns
CDN URLs protected by CloudFront signed cookies (same pattern as Linear).
Infrastructure:
- Two private S3 buckets (static.multica.ai, static-staging.multica.ai)
- Two CloudFront distributions with OAC and Trusted Key Groups
- ACM wildcard cert in us-east-1, DNS records in Route 53
- RSA signing key stored in AWS Secrets Manager
Backend:
- S3 storage service with CloudFront CDN domain support
- CloudFront signed cookie generation (RSA-SHA1)
- Private key loaded from Secrets Manager (env var fallback for local dev)
- Cookies set on login (VerifyCode) with 72h expiry matching JWT
- Upload handler: multipart form → S3 → CloudFront URL response
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(auth): add email verification login flow with 401 auto-redirect
Replace the old OAuth-based login with email verification codes:
- Backend: send-code / verify-code endpoints, verification_codes table (migration 009), rate limiting, Resend email service
- Frontend: two-step login UI (email → 6-digit OTP), auth store with sendCode/verifyCode
- SDK: ApiClient gains onUnauthorized callback; 401 responses auto-clear token and redirect to /login
- Fix login button staying disabled due to global isLoading state
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(auth): add brute-force protection, redirect loop guard, and expired code cleanup
- VerifyCode: increment attempts on wrong code, reject after 5 failed tries (migration 010)
- onUnauthorized: skip redirect if already on /login to prevent infinite loops
- SendCode: best-effort cleanup of expired verification codes older than 1 hour
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(auth): add master verification code for non-production environments
Allow code "888888" to bypass email verification in non-production
environments to simplify development and testing workflows.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(auth): add personal access tokens for CLI and API authentication
Add full-stack PAT support: users create tokens in Settings, CLI authenticates
via `multica auth login`. Server stores SHA-256 hashes only. Auth middleware
extended to accept both JWTs and PATs (distinguished by `mul_` prefix).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add .env.example with all supported provider env vars
(OpenAI, Anthropic, DeepSeek, Kimi, Groq, Mistral, Together, Google)
- Update README with environment setup instructions
- Document configuration priority and startup commands
- Whitelist .env.example in .gitignore
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>