Commit Graph

45 Commits

Author SHA1 Message Date
Bohan Jiang
6ac8314711 feat(lark): support both Feishu and Lark from one deployment (MUL-3083) (#3815)
* feat(lark): serve Feishu and Lark from one deployment, per installation

The Lark integration was locked to a single open-platform host chosen
deployment-wide (MULTICA_LARK_HTTP_BASE_URL / _CALLBACK_BASE_URL,
defaulting to open.feishu.cn), so one deployment could talk to only the
mainland Feishu cloud OR Lark international — never both. Teams on the
other tenant could not use the integration at all.

Make the host per-installation. The device-flow installer already
auto-detects the tenant (Lark emits tenant_brand="lark" mid-poll); we now
persist that as lark_installation.region, carry it on
InstallationCredentials.Region, and resolve the open-platform host per
call (REST + WS bootstrap) from the region. An explicit cfg.BaseURL
(env / httptest) still overrides every region, so existing tests and
staging/proxy setups keep working.

- migration 116: lark_installation.region TEXT NOT NULL DEFAULT 'feishu'
  CHECK (region IN ('feishu','lark')) — existing rows are all mainland.
- lark.Region enum + OpenPlatformBaseURL/RegionOrDefault helpers.
- registration: thread the detected region into finishSuccess so the
  install-time GetBotInfo hits the right cloud AND the row records it.
- every credential-build site (patcher, replier, WS provider, union_id
  backfill) copies region off the installation row.
- region is part of the WS supervisor fingerprint so a re-install that
  switches cloud restarts the connection.
- API: surface region on the installation listing DTO.

MUL-3083

Co-authored-by: multica-agent <github@multica.ai>

* feat(lark): surface installation region in settings UI

Read the per-installation region off the listings response: build the
"Manage in Lark" dev-console host from it (open.feishu.cn vs
open.larksuite.com instead of a hardcoded mainland host) and render a
Feishu / Lark badge on each connected bot. The field is optional and
defaults to Feishu when an older server omits it (API-compat). Adds the
region_feishu / region_lark labels to all four locales.

MUL-3083

Co-authored-by: multica-agent <github@multica.ai>

* docs(lark): document simultaneous Feishu + Lark support

The cloud each bot belongs to is now auto-detected at install and stored
per installation, so one deployment serves both. Replace the old
"point MULTICA_LARK_HTTP_BASE_URL at larksuite for international tenants"
guidance (now just an optional override) in all four locales.

MUL-3083

Co-authored-by: multica-agent <github@multica.ai>

* fix(lark): repair legacy Lark-international installs on upgrade

Review follow-up (MUL-3083). Migration 116 backfilled every existing
lark_installation to region='feishu', assuming all historical rows were
mainland. But self-host deployments could already run Lark international
via the deployment-wide MULTICA_LARK_HTTP_BASE_URL override, so those
rows are really Lark — clearing the override after upgrade (which the new
docs invite) would route them to open.feishu.cn and break them.

Add a one-shot startup repair, BackfillRegionFromLegacyOverride, fired
off the hot path like BackfillBotUnionIDs: when the deployment's global
base-URL override targets open.larksuite.com, relabel the still-default
'feishu' rows to 'lark'. Gating on the deployment-wide override is what
makes it safe — every pre-existing install on such a deployment was Lark.
Idempotent; no-op on mainland / fresh deployments. Verified end-to-end
against a scratch DB (flip then 0-row idempotent re-run).

Also document that a Lark/飞书 app_id is globally unique across both
clouds, which is what makes the app_id-keyed token cache and the
UNIQUE(app_id) constraint safe across regions (review nit).

MUL-3083

Co-authored-by: multica-agent <github@multica.ai>

* docs(lark): fix ops guidance to match auto per-installation region

Review follow-up (MUL-3083). .env.example and docker-compose.selfhost.yml
still told operators that international Lark requires pointing both base
URLs at open.larksuite.com — now wrong, and it would push a fresh
deployment back into a single-cloud override. Rewrite them: the base
URLs are optional deployment-wide overrides; normal dual-cloud operation
keeps them empty. Document the first-boot auto-relabel for deployments
migrating off the old single-cloud override, across the integration docs
(en/zh/ja/ko).

MUL-3083

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-05 16:03:13 +08:00
Multica Eve
905ebbdde1 fix(github): populate connected account name on install [MUL-3078] (#3811)
* fix(github): populate connected account name on install [MUL-3078]

The Settings → GitHub connection card was rendering 'Connected to
unknown' because:

1. fetchInstallationAccount in the setup callback hit GitHub's
   /app/installations/{id} endpoint unauthenticated. That endpoint
   requires App JWT auth; the call returned 401, and the function
   fell through to the 'unknown' placeholder which was persisted as
   account_login.
2. The installation webhook handler did upsert the row with the real
   login when GitHub later delivered installation.created, but it
   never published a github_installation:created event. The frontend
   query stayed stale, so the UI kept showing 'unknown' even after
   the row had been refreshed.

Fix:

- Add optional GITHUB_APP_ID + GITHUB_APP_PRIVATE_KEY env vars. When
  set, signGitHubAppJWT mints a short-lived RS256 JWT (back-dated 60s
  for clock skew, capped at 9m to stay inside GitHub's 10m max) and
  fetchInstallationAccount uses it as a Bearer token. The setup
  callback now writes the real org/user name on install.
- When the new env vars are not configured, the call still falls
  through to 'unknown' as before — but the webhook handler now
  publishes EventGitHubInstallationCreated after the upsert, so the
  realtime listener invalidates the installations query and the UI
  converges to the real value within seconds, no manual refresh.

Tests cover JWT signing (claims, signature, malformed PEM, partial
config), fetchInstallationAccount with a JWT-gated httptest mock,
and the webhook refresh + broadcast on a seeded 'unknown' row.

Docs updated for .env.example and github-integration /
environment-variables in en, zh, ja, ko.

Co-authored-by: multica-agent <github@multica.ai>

* test(github): defuse JWT clock-bomb by injecting parser time [MUL-3078]

PR review caught that TestSignGitHubAppJWT_ClaimsAndSignature signed the
token with a fixed 'now' (2026-06-05T12:00:00Z) but parsed it with a
default jwt.Parse, which uses real time.Now() for exp validation. Once
real wall clock crossed the token's exp (now + 9m = 12:09:00Z), the
test would have flipped to a deterministic failure on every CI run.

Inject the same fixed 'now' into the parser via jwt.WithTimeFunc so
both signing and validation share one clock. Verified independently
that without the fix the parser rejects the token as 'expired', and
with the fix it accepts.

Also clarified the fetchInstallationAccount comment to be unambiguous
about what 'do not block' actually means: the HTTP call IS synchronous
(no independent timeout, pre-existing wart), but a failure here just
falls back to the unknown placeholder rather than aborting the
callback.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-05 15:21:22 +08:00
Alex
4da43b383f fix: selfhost env does not accept LARK related env (MUL-3060) (#3771)
* fix: selfhost docker compose env does not accept LARK related env

* fix(selfhost): pass through MULTICA_LARK_CALLBACK_BASE_URL for international Lark

The inbound long-conn callback bootstrap reads MULTICA_LARK_CALLBACK_BASE_URL
(server/cmd/server/router.go buildLarkConnectorFactory ->
HTTPConnectionTokenFetcher), which defaults to open.feishu.cn with no
fallback to MULTICA_LARK_HTTP_BASE_URL. Without it forwarded into the
backend container, international Lark tenants can send (outbound HTTP via
MULTICA_LARK_HTTP_BASE_URL) but never receive messages — the bootstrap
still hits the mainland host.

Forward the var in docker-compose.selfhost.yml and document all three
Lark knobs in .env.example so operators can discover them from the
standard 'cp .env.example .env' onboarding path.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-04 19:49:50 +08:00
Multica Eve
ae27058b0a fix(attachments): unified download endpoint with mode + presign + proxy (MUL-2976) (#3747)
Fix attachment download for self-hosted deployments using private S3-compatible buckets without CloudFront. Closes #3721.

**Server**

- New unified `GET /api/attachments/{id}/download` endpoint that picks CloudFront / S3 presign / server proxy at request time.
- `ATTACHMENT_DOWNLOAD_MODE=auto|cloudfront|presign|proxy` and `ATTACHMENT_DOWNLOAD_URL_TTL` env knobs; `auto` routes Docker hostnames / localhost / private IPs through the proxy and public S3 endpoints through presign.
- `Storage.PresignGet` capability; S3 implementation generates presigned GET URLs.
- `attachmentToResponse` returns the unified relative endpoint instead of leaking raw unsigned S3 URLs when CloudFront is not configured. Proxy path streams via `io.Copy` with `Content-Disposition` / `Content-Length` / `Cache-Control: no-store` / `X-Content-Type-Options: nosniff`.

**Clients**

- CLI / Desktop / Mobile resolve relative `download_url` values against the configured API base. Desktop covers the Electron native download bridge and the media preview modal; Mobile covers `Linking.openURL`, the markdown image RN loader, and the composer's completed non-image file chip.
- Mobile gains a minimal Node-environment vitest lane wired into `mobile-verify.yml`.

**Docs**

- `.env.example`, `docker-compose.selfhost.yml`, `SELF_HOSTING_ADVANCED.md`, and the `environment-variables` doc set updated with the new env keys and the `ATTACHMENT_DOWNLOAD_MODE=proxy` recommendation for Docker / VPC-internal object stores.

**Tests**

- `internal/storage`, `internal/cli`, `internal/handler` (download endpoint, mode selection, proxy header, `/content` non-regression), `cmd/server` (trusted proxy parser).
- `packages/views/editor/use-download-attachment.test.tsx` and `attachment-preview-modal.test.tsx` exercise relative URL resolution + absolute pass-through.
- `apps/mobile/lib/attachment-url.test.ts` covers every helper branch plus the composer non-image chip case.
2026-06-04 14:52:57 +08:00
Bohan Jiang
8db619c1cd fix(email): wire SMTP_EHLO_NAME through self-host config + docs [MUL-2984] (#3749)
* fix(email): wire SMTP_EHLO_NAME through self-host config + docs

Follow-up to #3679, which added SMTP_EHLO_NAME in code but never exposed
it to operators.

- docker-compose.selfhost.yml: pass SMTP_EHLO_NAME through to the backend
  container. The compose env block is an explicit allowlist, so without
  this the override set in .env was silently dropped and never reached
  the process — making the escape hatch unusable on the docker path.
- Document the var alongside its SMTP_* siblings: .env.example,
  SELF_HOSTING_ADVANCED.md, environment-variables.mdx, auth-setup.mdx,
  and self-host-quickstart.mdx (the last two with a strict-relay example).
- email.go: log when os.Hostname() fails instead of silently falling back
  to net/smtp's lazy "localhost" — the exact greeting strict relays reject.
- Add TestNewEmailService_EHLOName covering the env override, trimming,
  and the hostname fallback.

MUL-2984

Co-authored-by: multica-agent <github@multica.ai>

* fix(email): gate EHLO resolution to SMTP mode + sync docs to zh/ja/ko

Addresses review nits on this PR:

- email.go: resolve smtpEHLOName only when SMTP_HOST is set, so the
  Resend / DEV-stdout paths never call os.Hostname() or emit its
  failure log. The EHLO name is only ever used on the SMTP send path.
- docs: add SMTP_EHLO_NAME to the zh/ja/ko variants of
  environment-variables, self-host-quickstart, and auth-setup, in sync
  with the English docs updated earlier in this PR.

Note: the ja/ko self-host-quickstart and auth-setup pages were already
missing the port-465 implicit-TLS example (pre-existing i18n drift from
an earlier SMTP_TLS change, unrelated to this PR); the new EHLO block is
inserted at the correct logical anchor regardless. A full ja/ko re-sync
is left as a separate follow-up.

MUL-2984

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-04 14:44:55 +08:00
fengchangguo-star
2cf8107fc8 feat(email): support implicit TLS (SMTPS/465) for SMTP relay (MUL-2768) (#3340)
* feat(email): support implicit TLS (SMTPS/465) for SMTP relay

The SMTP relay previously only did opportunistic STARTTLS: it dialed
plaintext and upgraded if the server advertised STARTTLS. Providers that
only offer implicit TLS on port 465 and do not advertise STARTTLS (e.g.
Aliyun enterprise mail) could not be used as a relay at all.

Add an SMTP_TLS env var:
  - unset / starttls (default): unchanged STARTTLS-upgrade behavior.
  - implicit / smtps / ssl: dial with tls.DialWithDialer (SMTPS).
Implicit TLS is auto-enabled when SMTP_PORT=465 and SMTP_TLS is unset, so
the common case works with no extra config. The startup log line now
reports the negotiated mode (starttls / implicit-tls).

Co-authored-by: multica-agent <github@multica.ai>

* feat(email): plumb SMTP_TLS through selfhost compose, warn on unknown values

The backend reads SMTP_TLS but docker-compose.selfhost.yml never forwarded
it, so SMTP_TLS=implicit on a non-standard port (or an explicit starttls
override on 465) silently did nothing inside the container. Add it to the
backend.environment block.

Also log a one-line warning when SMTP_TLS is set to an unrecognized value
(e.g. "tls"/"true"/"on"), which would otherwise fall through to STARTTLS
and fail to dial a 465 SMTPS port with no startup hint.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(email): cover SMTP_TLS precedence and alias resolution

Table-driven test over NewEmailService asserting the implicit-TLS decision:
465 auto-enables implicit; explicit starttls on 465 overrides auto-detect;
implicit/smtps/ssl aliases (case-insensitive, whitespace-trimmed) force SMTPS
on any port; unknown values fall back to starttls.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* docs: document SMTPS / SMTP_TLS support, drop "465 unsupported"

Port 465 implicit TLS is now supported, so the five places that said it was
unsupported are wrong. Replace those sentences, add an SMTP_TLS row to the
environment-variables tables (EN + ZH), and add a copy-pasteable SMTPS env
block to the auth-setup pages.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: guofengchang <guofengchang@cumulon.com>
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 18:15:04 +08:00
Ivan Vinokurov
9aa8ba0191 fix(runtimes): self-host daemon setup URLs (MUL-2804) (#3474)
Expose self-host daemon setup URLs from /api/config at runtime so the Add computer dialog renders the operator's own server/app domains, while Multica Cloud defaults stay unchanged.

Fixes #3013.
2026-05-30 18:13:02 +08:00
Bohan Jiang
90ddfb04e2 feat(self-host): DISABLE_WORKSPACE_CREATION env var (MUL-2777) (#3441)
* feat(self-host): DISABLE_WORKSPACE_CREATION env var (MUL-2777, #3433)

When self-hosters set DISABLE_WORKSPACE_CREATION=true, POST /api/workspaces
returns 403 for every caller and the UI hides every "Create workspace"
affordance (sidebar, modal, /workspaces/new page, onboarding Step 2). This
closes the gap where ALLOW_SIGNUP=false still let any signed-in user open
an isolated workspace the platform admin couldn't see.

- server: new Config.DisableWorkspaceCreation, gate in CreateWorkspace,
  workspace_creation_disabled in /api/config, Go tests.
- frontend: new workspaceCreationDisabled in configStore, hide sidebar
  entry, swap NewWorkspacePage / CreateWorkspaceModal / onboarding
  StepWorkspace to a "creation disabled, ask for invite" state when the
  flag is on, EN + zh-Hans locale strings.
- ops: .env.example, docker-compose.selfhost, helm values + configmap,
  SELF_HOSTING.md, SELF_HOSTING_ADVANCED.md, environment-variables docs
  (EN + zh).

Co-authored-by: multica-agent <github@multica.ai>

* fix(onboarding): drive create path off workspaceCreationAllowed (#3433)

PR #3441 review: when DISABLE_WORKSPACE_CREATION=true and the user already
has a workspace, StepWorkspace still walked the resume copy (`headline_resume`
/ `lede_resume` mentioning "or start another") and `creatingActive` ignored
the flag, leaving a stale clickable create CTA possible if /api/config
arrived late.

Refactor StepWorkspace to derive a single `workspaceCreationAllowed`
boolean from the config store. It now drives:

- Initial `mode` state (defaults to "existing" when disabled + reusing so
  the CTA is pre-armed for the only valid action).
- `creatingActive` so the footer CTA cannot fall back into the create
  branch even mid-render.
- Eyebrow / headline / lede strings — adds
  `creation_disabled_{eyebrow,headline,lede}_resume` (EN + zh-Hans) for
  the disabled + reusing variant.

Tests: cover the three reachable shapes — flag off + no existing, flag on
+ no existing, flag on + existing.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-28 16:42:08 +08:00
YOMXXX
bfb7c85491 fix(selfhost): derive local port URLs from env (MUL-2506) (#2939)
* fix(selfhost): derive local port URLs from env

* fix(selfhost): derive local script URLs
2026-05-24 13:05:53 +08:00
Jiayuan Zhang
cd37b4e3d6 feat(settings): consolidate GitHub options under a dedicated Settings tab (MUL-2414) 2026-05-19 17:23:30 +02:00
Kagura
59617f376e feat(auth): make auth token TTL configurable via AUTH_TOKEN_TTL env var (MUL-2371) (#2713)
* feat(auth): make auth token TTL configurable via AUTH_TOKEN_TTL env var

Add AUTH_TOKEN_TTL environment variable (in seconds) to override the
hardcoded 30-day auth token lifetime. Self-hosted deployments on trusted
networks can set a longer value to avoid frequent magic-link
re-authentication.

The value is read once at startup and cached. Invalid or missing values
fall back to the 30-day default with a warning log.

Closes #2685

* refactor(auth): extract parseAuthTokenTTL for testability

Address review feedback: extract pure parse function from sync.Once
wrapper so the parsing logic can be unit-tested independently.
Add TestParseAuthTokenTTL with table-driven cases.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* refactor(auth): accept Go duration strings + hoist shared TTL in SetAuthCookies

Address nice-to-have review feedback from Bohan-J:
- parseAuthTokenTTL now tries time.ParseDuration first (e.g. '8760h'),
  falling back to ParseInt for integer seconds
- Warn on unreasonable values (>10 years) but still accept them
- Hoist AuthTokenTTL() and time.Now() in SetAuthCookies so both
  cookies share the exact same expiry
- Add security trade-off note in .env.example
- Add 5 new test cases for duration strings

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>

* fix: use AuthTokenTTL() in CloudFront middleware, guard ParseInt overflow

Address review feedback from Bohan-J (round 2):

1. CloudFront refresh middleware (cloudfront.go:21) was hardcoding
   30*24*time.Hour instead of using auth.AuthTokenTTL(). Now calls
   AuthTokenTTL() so the middleware respects AUTH_TOKEN_TTL env var.

2. parseAuthTokenTTL integer-seconds branch: very large values like
   9999999999 would silently overflow int64 when multiplied by
   time.Second. Added overflow guard comparing against
   math.MaxInt64/int64(time.Second) before the multiplication.

3. Updated AuthTokenTTL() doc comment to reflect that it accepts
   Go duration strings or integer seconds (not just seconds).

4. Added middleware test (cloudfront_test.go) verifying short
   AUTH_TOKEN_TTL produces short cookie expiry, not 30-day hardcode.
   Also covers nil signer and existing-cookie-skip cases.

5. Added integer overflow test case to cookie_test.go.

* style: run gofmt on cookie.go and cookie_test.go

---------

Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
2026-05-19 16:22:07 +08:00
Bohan Jiang
eb5c6d7547 docs(self-host): document auth rate-limit env keys (#2773)
Adds REDIS_URL, RATE_LIMIT_AUTH, RATE_LIMIT_AUTH_VERIFY, and
RATE_LIMIT_TRUSTED_PROXIES to the environment-variables page (EN +
ZH) and to .env.example, with the reverse-proxy caveat that without
RATE_LIMIT_TRUSTED_PROXIES every user shares the proxy IP and the
whole deployment ends up in one bucket.

Follow-up to #2636. MUL-2251.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-18 13:11:17 +08:00
johnhu-1237
79dd066363 fix env example websocket origin (#2599) 2026-05-18 12:38:52 +08:00
Kerim Incedayi
9418d2a2c1 feat(autopilots): webhook triggers (server + CLI + UI + docs) MUL-2049 (#2348)
* feat(server): add webhook trigger DB migration + sqlc queries

Lays the foundation for webhook autopilot triggers:
- partial unique index on autopilot_trigger.webhook_token (kind=webhook only)
  so the public ingress route can resolve a trigger in O(1)
- GetWebhookTriggerByToken / TouchAutopilotTriggerFiredAt /
  RotateAutopilotTriggerWebhookToken / SetAutopilotTriggerWebhookToken
  queries, regenerated with sqlc

* feat(server): webhook token generator + payload normalizer

Two pure helpers for the webhook autopilot work:
- generateWebhookToken: 32 random bytes -> base64-url, "awt_" prefix.
  256 bits of entropy keeps brute-force off the table; the prefix makes
  leaked tokens recognisable in logs.
- normalizeWebhookPayload: turns arbitrary JSON into the WebhookEnvelope
  shape (event/eventPayload/request) used by trigger_payload. Header- and
  body-based event inference covers GitHub, GitLab, X-Event-Type, and
  caller-provided envelopes; scalar/empty/invalid bodies are rejected so
  the handler can answer 400.

* feat(server): generate webhook tokens and expose rotate endpoint

- New handler.Config.PublicURL fed by MULTICA_PUBLIC_URL env so
  /api/autopilots/.../triggers responses can include an absolute
  webhook_url alongside the always-present webhook_path.
- CreateAutopilotTrigger now mints a webhook_token via crypto/rand
  for kind=webhook and ignores cron/timezone for non-schedule kinds.
  api triggers stay accepted-but-inert per PLAN.md.
- New POST /api/autopilots/{id}/triggers/{triggerId}/rotate-webhook-token
  protected by the existing workspace auth group; old tokens stop
  working immediately because the unique-index lookup keys on the
  current row value.

* feat(server): public webhook ingress route + per-token rate limiter

- New POST /api/webhooks/autopilots/{token} route, mounted outside the
  authenticated group: the path token is the credential. Workspace
  context is derived from the joined autopilot row, never headers.
- Body capped at 256 KiB via http.MaxBytesReader; oversized payloads
  return 413 mid-read instead of being fully buffered.
- Disabled triggers / paused / archived autopilots return
  200 {"status":"ignored"} so providers stop retrying.
- Skipped-runtime dispatches surface 200 {"status":"skipped"} with the
  reason from the autopilot service's pre-flight admission check.
- WebhookRateLimiter interface with sliding-window in-memory + Redis
  Lua-script implementations. Default 60 req/min per token. Test
  coverage on the in-memory path; Redis variant fails open on cache
  errors so a Redis hiccup never blocks ingress.
- Integration tests exercise token generation, dispatch, payload
  envelope persistence, GitHub-header inference, paused/disabled
  short-circuits, oversized rejection, and rotate-then-old-token-404.

* feat(server): include webhook payload in create_issue description

When an autopilot run is triggered by a webhook and execution_mode is
create_issue, the agent only sees the issue body — never the run's
trigger_payload. Append a 'Webhook event:' line and a fenced JSON block
with the normalized eventPayload so the agent has the inbound context
inline. Schedule / manual runs are unchanged.

Tests cover:
  - schedule path keeps existing italic note, no webhook block
  - webhook path emits event line + payload block, italic before block
  - non-envelope JSON falls back to raw body (defensive)
  - non-webhook source with payload still gets no webhook block

* feat(core): types, API client and mutations for webhook triggers

- AutopilotRunStatus gains 'skipped' so the run-list UI handles the
  admission-skipped state explicitly instead of falling through to a
  generic case (the backend already emits it via MUL-1899).
- AutopilotTrigger picks up optional webhook_path / webhook_url. Both
  are optional so older self-hosted servers that pre-date this change
  still parse cleanly.
- buildAutopilotWebhookUrl helper composes a usable absolute URL with
  the priority webhook_url > apiBaseUrl + path > origin + path > path.
  Tested with seven cases covering each branch.
- ApiClient.rotateAutopilotTriggerWebhookToken posts to
  /api/autopilots/{id}/triggers/{triggerId}/rotate-webhook-token; the
  HTTP-contract test pins URL + method.
- useRotateAutopilotTriggerWebhookToken mutation invalidates
  autopilotKeys.detail on settle, mirroring the existing trigger-mutation
  pattern.

* feat(views): webhook trigger UI in Add Trigger dialog and trigger row

Add Trigger dialog gains a Schedule/Webhook segmented toggle:
  - Schedule reuses TriggerConfigSection unchanged.
  - Webhook hides the cron config and shows a help line; the trigger is
    created with kind=webhook and the URL is generated server-side.
  - Toast text differentiates schedule vs webhook on success.

TriggerRow grows a webhook branch:
  - Webhook icon, kind translated via trigger_kind.
  - URL shown in a truncating monospace pill, with copy + rotate
    buttons. Copy uses navigator.clipboard with toast feedback; rotate
    uses an AlertDialog confirm because the old URL stops working
    immediately.
  - api triggers render a Deprecated badge and skip URL/copy/rotate
    affordances.

RunRow gains a 'skipped' RUN_VISUAL entry (muted dash) so admission-
skipped runs don't fall through to a generic case. Source label uses the
new run_source i18n key instead of capitalize.

Locales: en + zh-Hans gain run_status.skipped, run_source.*,
trigger_kind.*, trigger_row.{copy_url,rotate_url,*_confirm_*,toast_*},
add_trigger_dialog.{type_*,webhook_help,toast_added_{schedule,webhook}}.

* feat(cli): support webhook trigger creation and URL rotation

- multica autopilot trigger-add now takes --kind schedule|webhook
  (default schedule for backward compatibility). For webhook it skips
  --cron / --timezone validation and prints the resulting webhook URL,
  preferring the server-provided webhook_url and falling back to
  client.BaseURL + webhook_path.
- New multica autopilot trigger-rotate-url <autopilot-id> <trigger-id>
  command for rotating the bearer URL of a webhook trigger.

* docs(autopilots): add webhook trigger guide (en + zh)

Replaces the 'Webhook and API triggers are not available yet' section
with end-to-end webhook documentation: how the URL is generated, what
payload shapes are accepted, the inferred-event rules, the bearer-secret
warning + rotate flow, status-code semantics for accepted/skipped/
ignored/4xx/5xx outcomes, and the MULTICA_PUBLIC_URL self-host
configuration.

Run history list now mentions skipped status. The 'unavailable
features' section narrows to api-kind triggers, HMAC signing, IP
allowlists, and provider presets.

* feat(views): add Schedule/Webhook toggle to the create autopilot dialog

Closes the gap where a brand-new autopilot could only be created with a
schedule trigger. The right-column config now has a Trigger section
with a segmented Schedule/Webhook control:
  - Schedule keeps the existing cron/timezone UI.
  - Webhook hides the cron UI and shows a help line; on submit, a
    kind=webhook trigger is created right after the autopilot.

In edit mode the toggle is intentionally hidden (PLAN.md treats trigger-
type changes as delete-old + create-new, not in-place updates), but the
panel still picks the right kind based on props.triggers[0].kind so a
webhook autopilot doesn't render an irrelevant cron form.

Locales: section_trigger_kind, trigger_kind_{schedule,webhook},
section_webhook, webhook_help_{create,edit} added in en + zh-Hans.

* feat(views): show webhook URL inline after creating a webhook autopilot

After a successful create with kind=webhook, the dialog stays open and
swaps to a confirmation panel showing the freshly minted URL with a
copy button + 'Treat this URL like a password' warning + Done button.
Avoids the friction of "create the autopilot, then go find it in the
list, click in, scroll to triggers, copy URL."

Locales: dialog.webhook_created_{title,description,warning,done} added
in en + zh-Hans.

Schedule create flow is unchanged (toast + close). The success panel is
gated on the trigger returned from the create mutation, so a partial
failure (autopilot created, trigger creation errored) still falls
through to the toast_create_partial path.

* feat(views): show webhook payload in run detail dialog

The agent transcript dialog now accepts an optional headerSlot that
sits above the event list. The autopilot RunRow drops a
WebhookPayloadPreview into that slot when the run came from a webhook
and trigger_payload is non-empty.

The preview is collapsed by default (the transcript itself is the main
event), shows the inferred event name + receivedAt in the header, and
reveals the eventPayload as pretty-printed JSON with a copy button on
expand. Falls back gracefully if the row's trigger_payload doesn't
match the WebhookEnvelope shape — the whole value is shown instead so
nothing is hidden.

Closes the "agent didn't echo the payload, now I can't see what
triggered the run" gap. PLAN.md tracked this as
"Payload preview in run history" under follow-ups.

Locales: webhook_payload.{label, unknown_event, payload, content_type,
copy, copied, copied_short, copy_failed} added in en + zh-Hans.

* chore(server): wire MULTICA_PUBLIC_URL through self-host compose

Two small follow-ups split out of the webhook trigger PR:

- docker-compose.selfhost.yml passes MULTICA_PUBLIC_URL into the
  backend container so a self-hosted deployment behind a real domain
  gets absolute webhook URLs in the trigger response. Documented in
  .env.example with the rationale for not deriving the public host
  from request headers.
- Drop a duplicated 'invalid json:' prefix in the webhook ingress
  400 error path. normalizeWebhookPayload already prefixes its
  errors, so the handler doesn't need to re-prefix.

* fix(migrations): renumber webhook trigger migration 081 → 089 to avoid collision

The branch's 081_autopilot_webhook_triggers.{up,down}.sql collided
numerically with 081_runtime_timezone.{up,down}.sql that landed on
main, making migration apply order undefined. Renumber to 089 so the
file slots after the latest main migration (088_squad_instructions).

The SQL itself doesn't conflict — it only creates a partial unique
index on autopilot_trigger.webhook_token — but the duplicate prefix
is what the migration runner sees, so the filename must move.

* fix(autopilot-webhook): address PR review blocking issues

- Redact bearer tokens from request logs: paths matching
  /api/webhooks/autopilots/<token> now log "[redacted]" instead of the
  token. The resolved trigger ID is plumbed via context so audit lines
  stay useful for debugging. (Review item Blocking #1.)
- Distinguish pgx.ErrNoRows from transient DB errors in token lookup:
  no-row stays 404 (so providers don't retry on a deleted webhook),
  other errors return 500 (which providers DO retry, avoiding silent
  drops on DB blips). (Review item Blocking #2.)
- Add per-IP sliding-window rate limiter that runs BEFORE the token
  lookup, so spraying random tokens can no longer probe the
  autopilot_trigger index unboundedly. Reuses the existing Lua script
  with a separate Redis key namespace; falls open on Redis errors.
  Default budget 30 req/min/IP. (Review item Blocking #3.)

The webhook handler now applies the gates in the order: per-IP rate
limit → token lookup → per-token rate limit → handler logic.

* fix(autopilot): atomic webhook trigger creation + strict kind/timezone validation

- Mint the webhook bearer token BEFORE the INSERT and pass it via
  CreateAutopilotTriggerParams so the row never exists in a half-written
  kind=webhook + webhook_token=NULL state. On the (vanishingly rare)
  unique-index collision the whole INSERT is retried with a fresh token
  — no UPDATE second step. Removes the now-dead attachFreshWebhookToken
  helper. (Review item Recommended #4.)
- Add new GET /api/autopilots/{id}/runs/{runId} endpoint that returns a
  single run including the full trigger_payload. The list response is
  now slim (omits trigger_payload) so worst-case payload size drops
  from ~5 MB to ~5 KB. (Review item Recommended #5, server side.)
- Reject kind=api with 400 ("kind=api is deprecated; use schedule or
  webhook") and reject kind=webhook with --timezone with 400 — both
  surfaces stragglers loudly instead of silently dropping fields.
  CLI mirrors the check so --timezone with --kind webhook errors
  client-side. (Review nits.)
- Add --yes (-y) flag and an interactive y/N confirmation prompt to
  `multica autopilot trigger-rotate-url` so the destructive rotate
  matches the UI's AlertDialog safety. (Review item Recommended #6.)

* fix(views): fetch webhook payload on-demand and truncate at 4 KiB

- Add useAutopilotRun query hook + getAutopilotRun API client method
  paired with the new server endpoint. The run-detail dialog now mounts
  a WebhookPayloadSlot that fetches the full run (incl. trigger_payload)
  lazily — list responses no longer carry up to 256 KiB × N runs of
  envelope data.
- WebhookPayloadPreview truncates its in-DOM <pre> at 4 KiB with a
  localized marker so jank-y machines aren't asked to render a 256 KiB
  JSON blob. The Copy button still yields the full string.
- Adds the truncated_marker i18n string to en + zh-Hans.

Review items Recommended #5 (frontend) and a nit on the preview's
unbounded <pre>.

* test(autopilot-webhook): close coverage gaps flagged in PR review

- request_logger: redactWebhookPath unit tests + integration test
  proving the bearer token never lands in slog output, plus the
  webhook_trigger_id context plumbing.
- autopilot_webhook_handler: empty body → 400, archived autopilot →
  200 ignored, per-IP rate limiter trips before DB lookup, kind=api
  and webhook+timezone are rejected at 400, slim list + full detail
  endpoint round-trip.
- webhook_rate_limiter: Lua script structure guard (catches reordering
  even without a live Redis), plus live-Redis tests for both per-token
  and per-IP limiters (REDIS_TEST_URL gated, matching the existing
  Redis test pattern in the package).
- WebhookPayloadPreview: envelope rendering, fallback shape, and the
  >4 KiB truncation path with full-payload-on-Copy guarantee.

Two branches are documented as code-review-protected rather than
covered by tests: the 500-on-DB-error path requires injecting a stub
Queries (no interface here), and the cross-workspace defense-in-depth
check is unreachable from valid SQL state.

* fix(middleware): SetWebhookTriggerID must mutate request in place

The round-1 helper returned a fresh *http.Request from WithContext, and
the webhook handler did `r = SetWebhookTriggerID(r, ...)`. That swaps
the handler's local pointer but doesn't propagate the new context back
to RequestLogger, which is still holding the original *http.Request —
so the audit line never actually included webhook_trigger_id in
production. The round-1 test happened to pass because it pre-stashed
the value on the request before calling ServeHTTP, bypassing the bug
it was meant to verify.

Switch to in-place mutation via `*r = *r.WithContext(...)` so the
wrapping middleware sees the new context after next.ServeHTTP returns,
and update the test to exercise the real call pattern (set the context
from inside the handler, assert the surrounding logger reads it).

Verified live: an accepted webhook now logs
  path=/api/webhooks/autopilots/[redacted] webhook_trigger_id=<uuid>

* fix(autopilot-webhook): symmetric ErrNoRows split + trusted-proxy gate

Round-2 review (Bohan-J, PR #2348 follow-up):

- Must-fix #1: the second lookup at autopilot_webhook.go:258
  (GetAutopilot after the token resolves) was folding every error into
  404. A transient DB blip would tell a webhook sender "not found" and
  it would never retry. Apply the same errors.Is(err, pgx.ErrNoRows)
  → 404 / else → 500 split as the first lookup got in round 1.

- Must-fix #2: clientIPForRateLimit was honoring X-Forwarded-For /
  X-Real-IP from any caller. An attacker spraying random tokens could
  just rotate the XFF header and the per-IP bucket became per-request,
  so the limiter that's specifically supposed to gate spraying before
  it hits the DB unique index was bypassed.

  New shape — matches Bohan's suggestion exactly:
  * Default: r.RemoteAddr only, headers ignored.
  * Operator opt-in via MULTICA_TRUSTED_PROXIES (comma-separated
    CIDRs). XFF/X-Real-IP are honored only when r.RemoteAddr is
    inside one of the listed prefixes; otherwise they're dropped.

  Wired through .env.example and docker-compose.selfhost.yml so
  self-host operators can configure their reverse-proxy's CIDR.
  Invalid CIDRs in the env var are dropped with a single slog.Warn at
  startup rather than crashing the server. Uses net/netip (stdlib,
  value-typed) for parsing and containment checks.

Verified live on the rebuilt self-host backend: a 35-request spray
from one source with rotating XFF gets the expected 30× 404 + 5× 429,
proving the per-IP bucket is keyed on the real connection IP.

* fix(autopilot): reject cron/timezone PATCH on non-schedule triggers

Round-2 review should-fix. CreateAutopilotTrigger already 400s on
kind=webhook + timezone/cron_expression, but UpdateAutopilotTrigger
silently wrote those fields regardless of prev.Kind. The values then
sat in the DB visible to nobody and read by nothing — a back door that
left the API contract fuzzy across create vs update.

Mirror the create-path discipline: after loading prev, if prev.Kind
!= "schedule" and the PATCH body sets cron_expression or timezone,
return 400 with a clear message. enabled and label remain accepted on
every kind.

The existing prev.Kind == "schedule" guard on next_run_at recompute
stays as belt-and-braces, but with this gate in place the recompute
branch is now reachable only for the kind it was meant for.

* test(autopilot-webhook): close round-2 coverage gaps

- IPRateLimitNotBypassedByXFFSpoof: drives the must-fix #2 invariant
  by rotating XFF across three calls from the same RemoteAddr and
  asserting the third gets 429. Pre-round-2 this test would have
  passed for the wrong reason (limiter trusted XFF, so per-bucket
  collision was incidental); now it pins the bypass-closed property.
- IPRateLimitReturns429BeforeDBLookup: updated to set RemoteAddr
  explicitly and drop the XFF header it was leaning on. With
  TrustedProxies empty (test default) the limiter keys on the real
  connection IP, which is what the test wants to assert anyway.
- UpdateAutopilotTrigger_RejectsCronExpressionOnWebhookKind +
  UpdateAutopilotTrigger_RejectsTimezoneOnWebhookKind: drive the
  round-2 should-fix from the handler boundary.
- UpdateAutopilotTrigger_AcceptsEnabledAndLabelOnWebhookKind: counter
  test so a regression to a blanket reject is caught.

* fix(migrations): bump webhook trigger migration 089 → 091

origin/main added 089_squad_no_action_activity_index (and 090_task_is_leader)
since our last rebase, re-colliding with our 089_autopilot_webhook_triggers.
Bump to 091 so the filename ordering is unambiguous again. The SQL is
unchanged — same partial unique index on autopilot_trigger.webhook_token —
only the filename moves.

* fix(views): dedupe skipped icon in autopilot RUN_VISUAL after rebase

The rebase against origin/main merged main's add of `Ban` for the
skipped status next to our round-1 `MinusCircle` entry, leaving the
RUN_VISUAL map with two `skipped` keys (only the last would have been
read at runtime, and MinusCircle had been dropped from the imports
during conflict resolution — so the file would not compile).

Keep main's `Ban` icon (latest design) and a single `skipped` entry.
Carry over the round-1 comment about why the muted styling matters
for failure-ratio readability.

---------

Co-authored-by: Kerim Incedayi <kerim.incedayi@digitalchargingsolutions.com>
2026-05-18 12:17:39 +08:00
Bohan Jiang
a23856bae3 MUL-1624 docs(email): clarify 888888 is opt-in; document SMTP option (#2666)
* docs(email): clarify 888888 is opt-in via MULTICA_DEV_VERIFICATION_CODE; document SMTP option in self-host docs

The startup log line, .env.example, and SELF_HOSTING_ADVANCED.md still
implied that the dev master code 888888 is auto-active whenever
APP_ENV != "production". That has not been true since the master code
was gated behind MULTICA_DEV_VERIFICATION_CODE — the fixed code is
disabled by default and must be opted in explicitly.

Also extend the docs site with the SMTP relay backend added in #1877:
auth-setup, environment-variables, and self-host-quickstart now cover
both Resend and SMTP options in EN and ZH.

Co-authored-by: multica-agent <github@multica.ai>

* docs(email): treat SMTP as an email backend in self-host docs and startup warning

Address review feedback on #2666:

- server: startup warning now fires only when both RESEND_API_KEY and SMTP_HOST
  are empty, since either one is a valid email backend. Otherwise the log
  mis-tells SMTP-only operators that verification codes go to stdout.
- self-host-quickstart (EN/ZH): tell readers to fetch the verification code
  from whichever backend they configured (Resend or SMTP); fall back to
  stdout only when neither is configured.
- auth-setup (EN/ZH): \"without Resend\" → \"without any email backend
  configured\" so the wording stays correct now that SMTP is a first-class
  option.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-15 14:18:46 +08:00
apollion69
35e9a7f0f6 feat(email): add SMTP relay as alternative to Resend for self-hosted deployments (#1877)
* feat(email): add SMTP relay as alternative to Resend

Self-hosted deployments often run behind a corporate firewall with an
existing SMTP relay (Exchange, Postfix, sendmail) and no access to
external SaaS APIs. Resend requires a public domain, an API key, and
outbound HTTPS to api.resend.com — all unavailable in air-gapped or
private-network setups.

This adds a second email delivery path using Go's stdlib net/smtp,
activated when SMTP_HOST is set. Priority order:
  1. SMTP relay  (SMTP_HOST set)
  2. Resend API  (RESEND_API_KEY set)
  3. DEV stdout  (neither set)

New env vars (all optional, no breaking change):
  SMTP_HOST            — SMTP server hostname
  SMTP_PORT            — port, default 25
  SMTP_USERNAME        — for authenticated SMTP; empty = unauthenticated relay
  SMTP_PASSWORD        — used only when SMTP_USERNAME is set
  SMTP_TLS_INSECURE    — set to "true" to skip TLS cert verification
                         (for private CA / self-signed certs)

The implementation:
- Dials TCP, creates smtp.Client manually (avoids smtp.SendMail which
  does not expose TLS config)
- Tries STARTTLS if advertised; uses InsecureSkipVerify only when
  SMTP_TLS_INSECURE=true (opt-in, nolint:gosec annotated)
- Applies PlainAuth only when SMTP_USERNAME is non-empty
- Wraps all errors with context for easier debugging
- Reuses existing HTML templates from buildInvitationParams for
  invitation emails (no template duplication)

Also updates .env.example and docker-compose.selfhost.yml with the
new variables and inline documentation.

* fix(email): add dial timeout, session deadline, RFC headers for SMTP path

Address review blockers from multica-eve and Bohan-J (PR #1877):

- net.Dial → net.DialTimeout(10s) + conn.SetDeadline(30s) so a blackholed
  SMTP relay cannot hang SendVerificationCode (called synchronously from the
  auth handler) or leak goroutines in the invitation path.
- Add Date, Message-ID, and proper Content-Transfer-Encoding headers.
  Date is required by RFC 5322; many strict relays reject messages without it.
  Message-ID aids deliverability and threading.
- MIME-encode Subject via mime.QEncoding so non-ASCII workspace/inviter names
  (CJK, emoji) survive without corruption across any RFC 2047-conformant relay.
- Probe 8BITMIME after (possible) STARTTLS: use Content-Transfer-Encoding 8bit
  when the relay advertises 8BITMIME, quoted-printable otherwise — safe for
  all relay configurations without forcing base64 overhead.
- Update SELF_HOSTING_ADVANCED.md to document Option B (SMTP relay) alongside
  the existing Resend section, including all five env vars and a note that
  port 465/SMTPS is not yet supported.

* fix(email): correct has8Bit assignment order (bool is first return of Extension)
2026-05-15 13:35:01 +08:00
Bohan Jiang
eca36fac84 fix(github): plumb GITHUB_APP_SLUG / GITHUB_WEBHOOK_SECRET through self-host (#2482)
The GitHub App integration code reads these two env vars and only enables
the Connect flow when both are set. .env.example never listed them, and
docker-compose.selfhost.yml did not forward them into the backend
container, so self-hosters following the integration docs had no working
way to turn the feature on.

MUL-2107

Co-authored-by: multica-agent <github@multica.ai>
2026-05-12 18:40:17 +08:00
Multica Eve
ce00e05169 Add canonical PostHog core metrics events (#2302)
* Add canonical PostHog core metrics events

Co-authored-by: multica-agent <github@multica.ai>

* Address analytics review feedback

Co-authored-by: multica-agent <github@multica.ai>

* Tighten analytics review follow-ups

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 13:12:00 +08:00
Bohan Jiang
89b939b07c fix(storage): build region-qualified S3 public URLs (#2051) (#2065)
* fix(storage): build region-qualified S3 public URLs (#2051)

The uploadedURL fallback (no CloudFront, no custom endpoint) wrote
"https://<bucket>/<key>" — missing the ".s3.<region>.amazonaws.com"
suffix — so any deployment that pointed S3_BUCKET at a real AWS bucket
without a CDN got broken image URLs back to the client. Avatar URLs
were persisted in this broken form on the user/agent rows, so profile
pictures uploaded via the SDK never rendered.

- Track S3_REGION on S3Storage and emit
  https://<bucket>.s3.<region>.amazonaws.com/<key> by default;
  fall back to path-style https://s3.<region>.amazonaws.com/<bucket>/<key>
  when the bucket name contains dots, since the AWS wildcard cert
  can't validate dotted virtual-hosted hosts.
- Teach KeyFromURL to recognise the new region-qualified hosts (both
  styles) and keep recognising the legacy bucket-only host so historical
  records can still be deleted/migrated.
- Document that S3_BUCKET is the bucket name only, not a hostname,
  in env-vars docs (en+zh), self-hosting guides, and .env.example.

Co-authored-by: multica-agent <github@multica.ai>

* feat(storage): warn at startup when S3_BUCKET looks like a hostname

Catches the most common misconfiguration shape (S3_BUCKET set to
"<bucket>.s3.<region>.amazonaws.com") with a startup log line so
operators don't silently end up with a config that signs uploads
against an invalid bucket name.

A real bucket name can never legitimately contain "amazonaws.com",
so the check is a single substring match — no false positives
worth carving out.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-06 12:45:55 +08:00
devv-eve
6ef711cd35 fix: gate dev verification code behind explicit env (#1773)
* fix: gate dev verification code behind explicit env

* docs: fold dev verification code into env table

* docs: clarify fixed verification code opt-in

---------

Co-authored-by: Eve <eve@multica.ai>
2026-04-28 15:14:07 +08:00
devv-eve
f864a07bd5 feat: add server Prometheus metrics endpoint
Add Prometheus metrics endpoint with local-bind listener support and baseline metrics collectors.
2026-04-28 14:29:01 +08:00
LinYushen
99154d97b9 Restrict /health/realtime metrics exposure (MUL-1342) (#1608)
* Restrict /health/realtime metrics exposure (MUL-1342)

The realtime metrics endpoint was registered on the public router with
no authentication, exposing per-event/per-scope counters, redis.last_error,
and redis.node_id to anonymous callers. This enables information disclosure
and traffic profiling.

Move the handler behind a token + loopback policy:

- If REALTIME_METRICS_TOKEN is set, require Authorization: Bearer <token>
  using a constant-time compare. Reject other callers with 401 plus a
  WWW-Authenticate hint.
- If the env var is unset, only serve loopback callers and return 404 to
  remote clients so the endpoint is not enumerable. This keeps local dev
  workflows working without configuration.

The handler is extracted into health_realtime.go with focused unit tests
covering the token, loopback, and rejection paths. .env.example documents
the new variable.

Refs: https://github.com/multica-ai/multica/issues/1606

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fail closed for proxied /health/realtime requests (MUL-1342)

Addresses review on PR #1608: when the server runs behind a reverse
proxy (Caddy / Nginx -> localhost:8080), public callers reach the Go
handler with RemoteAddr=127.0.0.1, so the previous loopback shortcut
exposed the metrics surface in self-hosted deployments.

The no-token path now treats any forwarding header
(X-Forwarded-For / -Host / -Proto, X-Real-Ip, Forwarded) as a
'this request was proxied, can't attribute, fail closed' signal and
returns 404. Direct loopback callers without those headers still work
for local dev. Token-gated path is unchanged.

Tests cover all listed proxy headers (incl. multi-hop XFF chain and
RFC 7239 Forwarded) over both 127.0.0.1 and ::1, plus a regression
case ensuring an empty/whitespace forwarding header does not break
direct loopback access. .env.example updated to call out that proxied
deployments must configure REALTIME_METRICS_TOKEN.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: CC-Girl <cc-girl@multica.ai>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-24 14:04:10 +08:00
devv-eve
fbf41bde73 feat(selfhost): ship public GHCR deployment flow
Publish stable GHCR self-host images, switch self-host deploys to official image pulls with a source-build fallback, and move self-host signup / Google OAuth config onto runtime /api/config.
2026-04-22 16:58:42 +08:00
devv-eve
637bdc8eb3 feat(analytics): full PostHog pipeline + 6 funnel events (MUL-1122) (#1367)
* feat(analytics): add PostHog client with async batch shipping

Introduces server/internal/analytics, the shipping layer for the product
funnel defined in docs/analytics.md. Capture is non-blocking — events are
enqueued into a bounded channel and a background worker batches them to
PostHog's /batch/ endpoint. A broken backend drops events rather than
blocking request handlers.

Local dev and self-hosted instances run a noop client until the operator
sets POSTHOG_API_KEY. This is PR 1 of MUL-1122; signup and workspace_created
emission land in the follow-up commit so this change is independently
reviewable.

* feat(server): emit signup and workspace_created analytics events

Wires analytics.Client through handler.New and main, then emits the first
two funnel events:

- signup fires from findOrCreateUser (which now reports isNew), covering
  both the verification-code and Google OAuth entry points — a single
  emission site guarantees Google signups aren't missed.
- workspace_created fires after the CreateWorkspace transaction commits,
  with is_first_workspace computed from a post-commit ListWorkspaces count
  so we can distinguish fresh-user activation from returning-user
  expansion.

Tests use analytics.NoopClient so nothing ships from test runs. PR 1 of
MUL-1122; runtime_registered and issue_executed follow in later PRs per
the plan.

* refactor(analytics): drop is_first_workspace from workspace_created

Stamping "is this the user's first workspace?" at emit time races under
concurrent CreateWorkspace requests: two transactions committing close
together can both read a post-commit count greater than one and both emit
false. Fixing it at the SQL layer requires a schema change we don't want in
PR 1.

PostHog answers the same question exactly from the event stream (funnel on
"first time user does X" / cohort on $initial_event), so removing the
property loses no information and makes the emit side race-free.

* docs(analytics): document self-host safety defaults

Spell out why self-hosted instances never ship events upstream by default
(empty POSTHOG_API_KEY → noop client) and explain how operators can point
at their own PostHog project without any code change.

* feat(analytics): emit runtime_registered, issue_executed, team_invite_*

Three server-side funnel events, all gated on first-time state transitions
so retries and re-runs don't inflate the WAW buckets:

- runtime_registered fires from DaemonRegister when UpsertAgentRuntime
  reports (xmax = 0) — i.e. the row was inserted, not updated. Heartbeats
  and re-registrations stay silent.
- issue_executed fires from CompleteTask after an atomic
  UPDATE issue SET first_executed_at = now() WHERE id = $1 AND
  first_executed_at IS NULL flips the column for the first time. Retries,
  re-assignments, and comment-triggered follow-up tasks hit the WHERE
  clause and no-op. Carries nth_issue_for_workspace so the ≥1/≥2/≥5/≥10
  buckets filter without extra queries.
- team_invite_sent fires from CreateInvitation and team_invite_accepted
  from AcceptInvitation, closing the expansion funnel.

Adds a 050 migration for issue.first_executed_at plus a partial index so
the workspace-scoped executed-count query doesn't scan the never-executed
tail.

* feat(config): surface PostHog key via /api/config

Extends AppConfig with posthog_key / posthog_host sourced from env on
every request (so operators can rotate the key via secret refresh without
a restart). Reading the key off the server — rather than baking it into
the frontend bundle via NEXT_PUBLIC_* — means self-hosted instances
inherit the blank key automatically and never ship events upstream.

* feat(analytics): wire posthog-js identify + UTM capture on the client

Adds @multica/core/analytics — a thin wrapper around posthog-js that owns
attribution capture and identity merge. Posthog-js config comes from
/api/config (not NEXT_PUBLIC_*), so self-hosted instances whose server
returns an empty key automatically run the SDK inert.

captureSignupSource stamps a multica_signup_source cookie with UTM params
and the referrer's origin (never the full referrer — that can leak OAuth
code/state in the callback URL). The backend signup event reads this
cookie on new-user creation.

Identity flows:
- auth-initializer fires identify() right after getMe() resolves, on both
  cookie and token paths. A getConfig/getMe race is handled by buffering
  a pending identify inside the analytics module and flushing it once
  initAnalytics finishes.
- auth store calls identify() on verifyCode / loginWithGoogle /
  loginWithToken and resetAnalytics() on logout so the next login merges
  cleanly without bleeding events.

* docs(analytics): describe runtime_registered, issue_executed, invite events

Fills in the schema for the remaining funnel events. Captures the
design commentary that belongs next to the contract rather than in a PR
description — in particular why issue_executed uses the atomic
first_executed_at flip instead of counting task-terminal events, and why
runtime_registered relies on xmax = 0 rather than a query-then-write.

* fix(analytics): drop non-atomic nth_issue_for_workspace from issue_executed

Computing the workspace's Nth-issue ordinal at emit time is not atomic
under concurrent first-completions — two transactions can both run
MarkIssueFirstExecuted, then both run CountExecutedIssuesInWorkspace, and
both observe count=1 before either has committed, so both events go out
stamped as n=1. Serialising it would mean a per-workspace advisory lock
or a SERIALIZABLE-isolated tx; PostHog answers the same question exactly
at query time via row_number() partitioned by workspace_id, so the
emit-time property adds risk without adding information.

Removes the property from analytics.IssueExecuted, deletes the unused
CountExecutedIssuesInWorkspace query, and regenerates sqlc. The partial
index stays — any future workspace-scoped executed-issue query will want
it.

* fix(analytics): wire $pageview and harden signup_source cookie payload

Two frontend fixes from the PR review:

- PageviewTracker, mounted under WebProviders, fires capturePageview on
  every Next.js App Router path / query-string change. Without this the
  capturePageview helper in @multica/core/analytics was never called and
  the acquisition funnel's / → signup step was empty.
- captureSignupSource now caps each UTM / referrer value at 96 chars
  *before* JSON.stringify, and drops the whole cookie when the serialised
  payload still exceeds 512 chars. Previously the overall slice(0, 256)
  could leave a half-JSON string on the wire that neither the backend nor
  PostHog could parse.

Both capturePageview and identify now buffer a single pending call when
fired before initAnalytics resolves — otherwise the initial "/" pageview
and same-turn login identify race the /api/config fetch and get dropped.
resetAnalytics clears both buffers so a logout→login cycle stays clean.

* fix(analytics): URL-decode signup_source cookie on read

Go does not URL-decode Cookie.Value automatically, so the frontend's
JSON-then-encodeURIComponent payload was landing in PostHog as
percent-encoded garbage (%7B%22utm_source...). Unescape on read so the
backend receives the original JSON string the frontend intended, and
drop values that fail to decode or exceed the server-side cap — sending
truncated garbage is worse than sending nothing. Oversized-cookie guard
matches the frontend's SIGNUP_SOURCE_MAX_LEN.

* docs(analytics): reflect nth-issue drop, $pageview wiring, cookie encoding

Pulls the schema doc back in line with the code: issue_executed no longer
advertises nth_issue_for_workspace (with a note about why PostHog derives
it at query time instead), the frontend $pageview section names the
actual PageviewTracker component that fires it, and the signup_source
section documents the per-value cap / overall drop rule and the
encode-on-write / decode-on-read contract.

---------

Co-authored-by: Jiang Bohan <bhjiang@outlook.com>
2026-04-21 14:42:52 +08:00
Bohan Jiang
824d943848 fix(auth): derive cookie Secure flag from FRONTEND_ORIGIN scheme (#1390)
The session cookie's Secure flag was tied to APP_ENV, and the
docker-compose self-host stack defaults APP_ENV to "production". On
plain-HTTP self-host deployments (LAN IP, private network) the browser
silently drops Secure cookies, leaving every subsequent /api/* call
anonymous and surfacing as 401 "auth: no token found" right after a
successful login.

Derive Secure from the scheme of FRONTEND_ORIGIN so HTTPS origins get
Secure cookies and plain-HTTP origins get non-secure cookies the
browser will actually store. Also harden cookieDomain() against the
other common trap: COOKIE_DOMAIN=<ip>, which RFC 6265 forbids and
browsers reject. Log a one-shot warning and fall back to host-only.

Docs: correct the COOKIE_DOMAIN description (it was labelled as
CloudFront-only but applies to session cookies too) and call out the
IP-literal pitfall in SELF_HOSTING_ADVANCED.md, self-hosting.mdx, and
.env.example.

Refs #1321
2026-04-20 19:53:15 +08:00
LinYushen
07034f4455 feat(server): configurable pgxpool size with sane defaults (#1381)
* feat(server): configurable pgxpool size with sane defaults

pgxpool.New(ctx, url) silently sets MaxConns = max(4, NumCPU). On the
prod pods that resolved to 4, which got fully saturated by daemon
claim/heartbeat traffic (~3800 acquires/s) and showed up as ~900ms
acquire waits on every query — the actual root cause of the 3s+
/tasks/claim tail latency. The db pool stats logging from #1378
confirmed this with empty_acquire_delta == acquire_count_delta.

Switch to pgxpool.ParseConfig + NewWithConfig and apply per-pod
defaults of MaxConns=25 / MinConns=5, both overridable via env vars
(DATABASE_MAX_CONNS / DATABASE_MIN_CONNS) so the size can be tuned
in prod without a redeploy.

The defaults follow the standard 'small pool, lots of waiters' guidance
for Postgres (PG community / HikariCP formula
`(core_count * 2) + effective_spindle_count`); 25 leaves headroom for
bursts and occasional long queries while staying safely under typical
managed-Postgres max_connections ceilings when multiplied across pods.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(server): respect DATABASE_URL pool_* params; add precedence tests

Address review feedback on #1381:

- Configuration precedence is now explicit: DATABASE_MAX_CONNS env >
  pool_max_conns query param on DATABASE_URL > built-in default. Same
  for min_conns. Previously the env-empty path unconditionally
  overwrote whatever ParseConfig had read from the URL — a silent
  regression for deployments that already tuned pool size via the
  connection string.
- Add unit tests in dbstats_test.go covering each precedence branch
  (defaults, URL-only, env-over-URL, partial URL, invalid env,
  min>max clamp).
- Move pool tuning vars out of 'Required Variables' into a new
  'Database Pool Tuning (Optional)' section in SELF_HOSTING_ADVANCED.md
  so self-hosters don't think they need to set them.
- Add commented entries in .env.example.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(server): invalid pool env falls back to URL/code default, never pgx 4

Address second round of review on #1381:

Previous code passed cfg.MaxConns / cfg.MinConns as the envInt32 fallback,
which meant an invalid DATABASE_MAX_CONNS value silently fell back to
ParseConfig's value — i.e. pgx's built-in default of 4/0 when the URL had
no pool_* params. That's exactly the bad value this PR exists to remove,
and the previous test (TestPoolSizing_InvalidEnvFallsBack) accidentally
locked it in.

Compute the non-env fallback first (URL pool_* if present, else code
default 25/5) and pass that to envInt32. Misconfigured env now lands on
the same value as if the env were unset — never on the pgx default.

Replace the loose 'max > 0' assertion with two precise tests:
- invalid env + no URL param → code default (25/5)
- invalid env + URL param    → URL value

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-20 17:07:19 +08:00
Bohan Jiang
96ee5bba52 docs(selfhost): surface APP_ENV + 888888 gating in .env.example (#1361)
The v0.2.6 self-host security fix (#1307) defaults APP_ENV to "production"
in docker-compose.selfhost.yml, which disables the 888888 master verification
code. The follow-up docs PR (#1313) covered SELF_HOSTING.md and the
installers, but `.env.example` — the file users actually copy and edit —
still makes no mention of APP_ENV, so operators who don't read the prose
docs hit the exact same "888888 stopped working after upgrade" confusion
reported in #1331.

- Add APP_ENV= to .env.example with a comment block that spells out the three
  cases (Docker default, local dev, evaluation) and warns against enabling
  the bypass on public instances. Keeping the value empty preserves the
  current `make dev` UX (Go server reads empty → treats as non-production →
  888888 works locally) while `${APP_ENV:-production}` in the compose file
  still ensures public Docker deployments are safe by default.
- Update the existing 888888 mention under # Email so it no longer
  contradicts the new gating rule.
- Update the `make selfhost` post-start banner, which still told operators
  to "Log in with any email + verification code: 888888" even after #1307
  disabled that path by default.
2026-04-20 13:26:42 +08:00
Azaan Ali Raza
b428f36ca6 feat: add ALLOW_SIGNUP + ALLOWED_EMAIL_* for self-hosted instances (#1098)
Closes #930

- Added environment variables to control signups
- Updated frontend to hide signup text when disabled
- Added backend check to block new user creation via magic link
- Updated .env.example
2026-04-19 21:02:42 -07:00
LinYushen
5dad1f0915 fix(selfhost): clear hardcoded NEXT_PUBLIC_API_URL/WS_URL defaults (#1063)
The .env.example had hardcoded http://localhost:8080 defaults for
NEXT_PUBLIC_API_URL and NEXT_PUBLIC_WS_URL. When users copied .env.example
to .env and customized the backend port, the old defaults would still get
baked into the frontend at docker build time via NEXT_PUBLIC_WS_URL build
arg, causing API/WebSocket connection failures.

With empty defaults:
- Docker selfhost: frontend uses relative paths, Next.js rewrites proxy
  to backend internally — works regardless of external port config
- Local dev (make dev): Makefile sets these to localhost:$PORT automatically
- Browser fallback: deriveWsUrl() auto-derives WebSocket URL from page
  origin when NEXT_PUBLIC_WS_URL is empty

Closes #1055

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 14:56:30 +08:00
LinYushen
c0db3e0e76 Revert "feat(selfhost): add single-domain Caddy setup (#899)" (#1062)
This reverts commit 100146c49e.
2026-04-15 14:44:47 +08:00
KimSeongJun
100146c49e feat(selfhost): add single-domain Caddy setup (#899)
* selfhost: add single-domain caddy setup

* fix(selfhost): address Caddy review feedback
2026-04-14 20:20:26 -07:00
LinYushen
95bfd7dd96 feat(auth): migrate auth token to HttpOnly Cookie & WebSocket Origin whitelist (#819)
* feat(auth): migrate auth token to HttpOnly cookie & implement WebSocket Origin whitelist

Security improvements from the MUL-566 audit report:

1. Auth token is now set as an HttpOnly, SameSite=Lax cookie on login,
   preventing XSS-based token theft. Cookie-based auth includes CSRF
   protection via double-submit cookie pattern. The Authorization header
   path is preserved for Electron desktop app and CLI/PAT clients.

2. WebSocket upgrader now validates the Origin header against a
   configurable allowlist (ALLOWED_ORIGINS env var), rejecting
   connections from unauthorized origins.

Backend: new auth cookie helpers, middleware reads cookie as fallback,
WS handler accepts cookie auth, Origin whitelist, logout endpoint.
Frontend: CSRF token in API headers, cookie-aware auth store and WS
client, web app opts into cookieAuth mode while desktop keeps tokens.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(auth): address PR review — Strict cookies, HMAC-bound CSRF, origin sync

1. SameSite=Lax → SameSite=Strict per spec requirement
2. CSRF token now HMAC-signed with auth token (nonce.signature format),
   preventing subdomain cookie injection attacks
3. allowedWSOrigins uses atomic.Value to eliminate data race
4. Removed magic "cookie" sentinel string in WSProvider — pass null token
   and guard with boolean check instead
5. Removed dead delete uploadHeaders["Content-Type"] in API client

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:13:35 +08:00
Antar Das
d32c419b6d feat(storage): add local file storage fallback (#710)
* feat(storage): add local file storage fallback

- Add local storage implementation for file uploads
- Update .env.example with LOCAL_UPLOAD_DIR and LOCAL_UPLOAD_BASE_URL
- Integrate local storage into server router and handlers
- Add storage abstraction layer with util functions

* ♻️ refactor(storage): improve path handling and file serving

switch from path to filepath for better cross-platform support and replace manual file serving logic with http.ServeFile to enhance security against path traversal. update unit tests to use t.Setenv for cleaner environment variable management.
2026-04-12 14:04:22 +08:00
Jiayuan Zhang
f100b5b707 fix(auth): graceful email degradation for self-hosting (#742)
* fix(auth): log email send errors and gracefully degrade in non-production

In non-production environments (APP_ENV != "production"), if sending the
verification code email fails, log the error as a warning and still return
success. This lets self-hosting users log in with the master code (888888)
even when their Resend configuration is incomplete (e.g. unverified from-domain).

In production, the behavior is unchanged — email failures return 500.

Also adds guidance in .env.example about RESEND_FROM_EMAIL for self-hosters.

Closes #723

* fix(auth): remove APP_ENV degradation, keep error logging only

Remove the APP_ENV-based graceful degradation for email send failures
— it's risky if users forget to set APP_ENV=production. Instead, always
return 500 on email failure (safe for production) and rely on the error
log (slog.Error) with the actual Resend error for debugging.

Self-hosters who don't need real emails should leave RESEND_API_KEY empty
(codes print to stdout, master code 888888 works).
2026-04-12 02:30:01 +08:00
Jiang Bohan
14fe8e9df9 feat(auth): add Google OAuth login
Support Google login that links to existing accounts by email.
When a user who registered via email OTP signs in with Google using
the same email, they are linked to the same account.

Backend:
- Add POST /auth/google endpoint that exchanges Google auth code for
  tokens, fetches user profile, and calls findOrCreateUser()
- Updates user name and avatar from Google profile on first Google login

Frontend:
- Add "Continue with Google" button on login page (shown when
  NEXT_PUBLIC_GOOGLE_CLIENT_ID is configured)
- Add /auth/callback page to handle Google OAuth redirect
- Add loginWithGoogle to auth store and API client
2026-04-07 15:25:26 +08:00
Jiang Bohan
89bedb8f5c feat(web): support REMOTE_API_URL env for proxying to remote backend
- Load root .env in next.config.ts so REMOTE_API_URL is available
- Default fallback remains localhost:8080 (no impact on existing setups)
- Add REMOTE_API_URL to .env.example with documentation
2026-03-31 16:53:32 +08:00
yushen
29a80e057e feat(upload): add file upload API with S3 + CloudFront signed cookies
Add POST /api/upload-file endpoint that uploads files to S3 and returns
CDN URLs protected by CloudFront signed cookies (same pattern as Linear).

Infrastructure:
- Two private S3 buckets (static.multica.ai, static-staging.multica.ai)
- Two CloudFront distributions with OAC and Trusted Key Groups
- ACM wildcard cert in us-east-1, DNS records in Route 53
- RSA signing key stored in AWS Secrets Manager

Backend:
- S3 storage service with CloudFront CDN domain support
- CloudFront signed cookie generation (RSA-SHA1)
- Private key loaded from Secrets Manager (env var fallback for local dev)
- Cookies set on login (VerifyCode) with 72h expiry matching JWT
- Upload handler: multipart form → S3 → CloudFront URL response

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 14:41:17 +08:00
LinYushen
5c9c2f69fd feat(auth): email verification login and personal access tokens
* feat(auth): add email verification login flow with 401 auto-redirect

Replace the old OAuth-based login with email verification codes:
- Backend: send-code / verify-code endpoints, verification_codes table (migration 009), rate limiting, Resend email service
- Frontend: two-step login UI (email → 6-digit OTP), auth store with sendCode/verifyCode
- SDK: ApiClient gains onUnauthorized callback; 401 responses auto-clear token and redirect to /login
- Fix login button staying disabled due to global isLoading state

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(auth): add brute-force protection, redirect loop guard, and expired code cleanup

- VerifyCode: increment attempts on wrong code, reject after 5 failed tries (migration 010)
- onUnauthorized: skip redirect if already on /login to prevent infinite loops
- SendCode: best-effort cleanup of expired verification codes older than 1 hour

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(auth): add master verification code for non-production environments

Allow code "888888" to bypass email verification in non-production
environments to simplify development and testing workflows.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(auth): add personal access tokens for CLI and API authentication

Add full-stack PAT support: users create tokens in Settings, CLI authenticates
via `multica auth login`. Server stores SHA-256 hashes only. Auth middleware
extended to accept both JWTs and PATs (distinguished by `mul_` prefix).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 14:32:30 +08:00
Jiayuan Zhang
2c28c4cba2 refactor(dev): share postgres across main and worktrees 2026-03-24 14:27:35 +08:00
Jiayuan Zhang
cdfa63af15 feat(runtime): add local codex daemon pairing 2026-03-24 12:03:14 +08:00
Jiayuan Zhang
81e64e9fce Add workspace management and isolated worktree environments 2026-03-23 18:12:11 +08:00
Jiayuan Zhang
d4f5c5b16f feat: pivot to AI-native task management platform (#232)
Replace the agent framework codebase with a new monorepo structure
for an AI-native Linear-like product where agents are first-class citizens.

New architecture:
- server/ — Go backend (Chi + gorilla/websocket + sqlc)
  - API server with REST routes for issues, agents, inbox, workspaces
  - WebSocket hub for real-time updates
  - Local daemon entry point for agent runtime connection
  - PostgreSQL migration with 13 tables (issue, agent, inbox, etc.)
  - WebSocket protocol types for server<->daemon communication
- apps/web/ — Next.js 16 frontend
  - Dashboard layout with sidebar navigation
  - Route skeleton: inbox, issues, agents, board, settings
- packages/ui/ — Preserved shadcn/ui design system (26+ components)
- packages/types/ — Full API contract types (Issue, Agent, Workspace, Inbox, Events)
- packages/sdk/ — REST ApiClient + WebSocket WSClient
- packages/store/ — Zustand stores (issue, agent, inbox, auth)
- packages/hooks/ — React hooks (useIssues, useAgents, useInbox, useRealtime)
- packages/utils/ — Shared utilities

Removed: apps/cli, apps/desktop, apps/mobile, apps/gateway,
packages/core, skills/, and all agent-framework code.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 17:55:49 +08:00
Jiayuan Zhang
005908710e chore: add local dev script for Gateway + Desktop with Telegram bot
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 00:39:43 +08:00
Jiayuan
c54e768016 docs: update credential setup 2026-02-01 02:28:43 +08:00
yushen
a9dcde124b docs: add .env.example and environment configuration guide
- Add .env.example with all supported provider env vars
  (OpenAI, Anthropic, DeepSeek, Kimi, Groq, Mistral, Together, Google)
- Update README with environment setup instructions
- Document configuration priority and startup commands
- Whitelist .env.example in .gitignore

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 13:32:01 +08:00