multica/server/internal/integrations/lark/installation.go
Bohan Jiang 6ac8314711 feat(lark): support both Feishu and Lark from one deployment (MUL-3083) (#3815)
* feat(lark): serve Feishu and Lark from one deployment, per installation

The Lark integration was locked to a single open-platform host chosen
deployment-wide (MULTICA_LARK_HTTP_BASE_URL / _CALLBACK_BASE_URL,
defaulting to open.feishu.cn), so one deployment could talk to only the
mainland Feishu cloud OR Lark international — never both. Teams on the
other tenant could not use the integration at all.

Make the host per-installation. The device-flow installer already
auto-detects the tenant (Lark emits tenant_brand="lark" mid-poll); we now
persist that as lark_installation.region, carry it on
InstallationCredentials.Region, and resolve the open-platform host per
call (REST + WS bootstrap) from the region. An explicit cfg.BaseURL
(env / httptest) still overrides every region, so existing tests and
staging/proxy setups keep working.

- migration 116: lark_installation.region TEXT NOT NULL DEFAULT 'feishu'
  CHECK (region IN ('feishu','lark')) — existing rows are all mainland.
- lark.Region enum + OpenPlatformBaseURL/RegionOrDefault helpers.
- registration: thread the detected region into finishSuccess so the
  install-time GetBotInfo hits the right cloud AND the row records it.
- every credential-build site (patcher, replier, WS provider, union_id
  backfill) copies region off the installation row.
- region is part of the WS supervisor fingerprint so a re-install that
  switches cloud restarts the connection.
- API: surface region on the installation listing DTO.

MUL-3083

Co-authored-by: multica-agent <github@multica.ai>

* feat(lark): surface installation region in settings UI

Read the per-installation region off the listings response: build the
"Manage in Lark" dev-console host from it (open.feishu.cn vs
open.larksuite.com instead of a hardcoded mainland host) and render a
Feishu / Lark badge on each connected bot. The field is optional and
defaults to Feishu when an older server omits it (API-compat). Adds the
region_feishu / region_lark labels to all four locales.

MUL-3083

Co-authored-by: multica-agent <github@multica.ai>

* docs(lark): document simultaneous Feishu + Lark support

The cloud each bot belongs to is now auto-detected at install and stored
per installation, so one deployment serves both. Replace the old
"point MULTICA_LARK_HTTP_BASE_URL at larksuite for international tenants"
guidance (now just an optional override) in all four locales.

MUL-3083

Co-authored-by: multica-agent <github@multica.ai>

* fix(lark): repair legacy Lark-international installs on upgrade

Review follow-up (MUL-3083). Migration 116 backfilled every existing
lark_installation to region='feishu', assuming all historical rows were
mainland. But self-host deployments could already run Lark international
via the deployment-wide MULTICA_LARK_HTTP_BASE_URL override, so those
rows are really Lark — clearing the override after upgrade (which the new
docs invite) would route them to open.feishu.cn and break them.

Add a one-shot startup repair, BackfillRegionFromLegacyOverride, fired
off the hot path like BackfillBotUnionIDs: when the deployment's global
base-URL override targets open.larksuite.com, relabel the still-default
'feishu' rows to 'lark'. Gating on the deployment-wide override is what
makes it safe — every pre-existing install on such a deployment was Lark.
Idempotent; no-op on mainland / fresh deployments. Verified end-to-end
against a scratch DB (flip then 0-row idempotent re-run).

Also document that a Lark/飞书 app_id is globally unique across both
clouds, which is what makes the app_id-keyed token cache and the
UNIQUE(app_id) constraint safe across regions (review nit).

MUL-3083

Co-authored-by: multica-agent <github@multica.ai>

* docs(lark): fix ops guidance to match auto per-installation region

Review follow-up (MUL-3083). .env.example and docker-compose.selfhost.yml
still told operators that international Lark requires pointing both base
URLs at open.larksuite.com — now wrong, and it would push a fresh
deployment back into a single-cloud override. Rewrite them: the base
URLs are optional deployment-wide overrides; normal dual-cloud operation
keeps them empty. Document the first-boot auto-relabel for deployments
migrating off the old single-cloud override, across the integration docs
(en/zh/ja/ko).

MUL-3083

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-05 16:03:13 +08:00

160 lines
6.2 KiB
Go

package lark

import (
	"context"
	"errors"
	"fmt"

	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgtype"

	"github.com/multica-ai/multica/server/internal/util/secretbox"
	db "github.com/multica-ai/multica/server/pkg/db/generated"
)

// InstallationParams is the input shape RegistrationService assembles
// after a successful device-flow scan-to-install. The credentials are
// supplied here as plaintext — encryption happens inside
// InstallationService.Upsert via the supplied *secretbox.Box, so
// callers never see (and therefore cannot leak) the ciphertext that
// lands in the DB.
type InstallationParams struct {
	WorkspaceID     pgtype.UUID
	AgentID         pgtype.UUID
	AppID           string
	AppSecret       string // plaintext; encrypted at the service boundary
	TenantKey       string // optional, "" treated as NULL
	BotOpenID       string
	InstallerUserID pgtype.UUID
	Region          Region // which cloud (feishu/lark); empty defaults to feishu
}

// InstallationService creates, refreshes and revokes per-agent Lark
// installations. It owns the at-rest encryption of `app_secret` so
// that no caller (and no test fixture) can accidentally insert a row
// with plaintext credentials — the only path to writing
// lark_installation goes through here.
type InstallationService struct {
	queries *db.Queries
	box     *secretbox.Box
}

// NewInstallationService binds the service to a queries handle and a
// secretbox keyed for at-rest encryption. The box MUST be non-nil; we
// refuse to fall back to plaintext storage even in test or dev
// configurations because that is exactly the regression the §4.4
// requirement guards against.
func NewInstallationService(queries *db.Queries, box *secretbox.Box) (*InstallationService, error) {
	if box == nil {
		return nil, errors.New("lark: InstallationService requires a non-nil secretbox.Box")
	}
	return &InstallationService{queries: queries, box: box}, nil
}

// Upsert creates a new installation or refreshes an existing one in
// place (matching on the (workspace_id, agent_id) UNIQUE). Re-install
// resets status to 'active' but does NOT touch the WS lease — that is
// the hub's concern, not ours. The returned row is the post-write
// state; the encrypted secret column is included for completeness but
// callers SHOULD NOT log or persist it elsewhere.
func (s *InstallationService) Upsert(ctx context.Context, p InstallationParams) (db.LarkInstallation, error) {
	if err := validateInstallationParams(p); err != nil {
		return db.LarkInstallation{}, err
	}
	sealed, err := s.box.Seal([]byte(p.AppSecret))
	if err != nil {
		return db.LarkInstallation{}, fmt.Errorf("encrypt app_secret: %w", err)
	}
	return s.queries.UpsertLarkInstallation(ctx, db.UpsertLarkInstallationParams{
		WorkspaceID:        p.WorkspaceID,
		AgentID:            p.AgentID,
		AppID:              p.AppID,
		AppSecretEncrypted: sealed,
		TenantKey:          textOrNull(p.TenantKey),
		BotOpenID:          p.BotOpenID,
		InstallerUserID:    p.InstallerUserID,
		Region:             string(RegionOrDefault(string(p.Region))),
	})
}

// Revoke flips status to 'revoked' so the WS hub tears the connection
// down on its next sweep and the dispatcher drops any in-flight
// events. The row is preserved (no DELETE) so audit history remains
// queryable; a subsequent re-install via Upsert flips status back to
// 'active' atomically.
func (s *InstallationService) Revoke(ctx context.Context, id pgtype.UUID) error {
	return s.queries.SetLarkInstallationStatus(ctx, db.SetLarkInstallationStatusParams{
		ID:     id,
		Status: string(InstallationRevoked),
	})
}

// DecryptAppSecret returns the plaintext app_secret for the supplied
// installation row. Used by the WebSocket hub when it needs to
// authenticate against the Lark API on behalf of an installation; do
// NOT use this for read-only display surfaces. The plaintext value
// must never round-trip through an HTTP response.
func (s *InstallationService) DecryptAppSecret(inst db.LarkInstallation) (string, error) {
	plain, err := s.box.Open(inst.AppSecretEncrypted)
	if err != nil {
		return "", fmt.Errorf("decrypt app_secret: %w", err)
	}
	return string(plain), nil
}

// GetInWorkspace is the workspace-scoped lookup helper. Internal
// callers (Dispatcher) use GetLarkInstallationByAppID directly because
// the event payload only carries app_id; HTTP-side callers always
// know the workspace and should use this so a forged installation_id
// from a different workspace returns NotFound instead of leaking
// existence.
func (s *InstallationService) GetInWorkspace(ctx context.Context, id, workspaceID pgtype.UUID) (db.LarkInstallation, error) {
	row, err := s.queries.GetLarkInstallationInWorkspace(ctx, db.GetLarkInstallationInWorkspaceParams{
		ID:          id,
		WorkspaceID: workspaceID,
	})
	if err != nil {
		if errors.Is(err, pgx.ErrNoRows) {
			return db.LarkInstallation{}, ErrInstallationNotFound
		}
		return db.LarkInstallation{}, err
	}
	return row, nil
}

// ListByWorkspace returns every installation rooted at the workspace,
// active and revoked, oldest first. The status column lets the UI
// distinguish "wired up" from "torn down but kept for audit".
func (s *InstallationService) ListByWorkspace(ctx context.Context, workspaceID pgtype.UUID) ([]db.LarkInstallation, error) {
	return s.queries.ListLarkInstallationsByWorkspace(ctx, workspaceID)
}

// ErrInstallationNotFound surfaces "no row matches in this workspace"
// — used by the HTTP layer to return 404. Distinct from a plain
// pgx.ErrNoRows so handlers do not need to import pgx.
var ErrInstallationNotFound = errors.New("lark installation not found")

func validateInstallationParams(p InstallationParams) error {
	switch {
	case !p.WorkspaceID.Valid:
		return errors.New("workspace_id is required")
	case !p.AgentID.Valid:
		return errors.New("agent_id is required")
	case !p.InstallerUserID.Valid:
		return errors.New("installer_user_id is required")
	case p.AppID == "":
		return errors.New("app_id is required")
	case p.AppSecret == "":
		return errors.New("app_secret is required")
	case p.BotOpenID == "":
		return errors.New("bot_open_id is required")
	}
	return nil
}

func textOrNull(s string) pgtype.Text {
	if s == "" {
		return pgtype.Text{}
	}
	return pgtype.Text{String: s, Valid: true}
}
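
The file above relies on a box with `Seal` and `Open` methods but does not show how it works. As a rough illustration of the seal-on-write, open-on-read pattern, here is a minimal sketch using AES-GCM from the standard library. It assumes nothing about the real `secretbox.Box`: the `box` type, `newBox`, the key size, and the nonce-prefixed ciphertext layout are all hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// box is a hypothetical stand-in for secretbox.Box, backed by AES-GCM.
type box struct{ aead cipher.AEAD }

// newBox builds a box from a 16-, 24- or 32-byte AES key.
func newBox(key []byte) (*box, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return &box{aead: aead}, nil
}

// Seal encrypts plain under a fresh random nonce and returns
// nonce || ciphertext, so the stored column is self-describing.
func (b *box) Seal(plain []byte) ([]byte, error) {
	nonce := make([]byte, b.aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return b.aead.Seal(nonce, nonce, plain, nil), nil
}

// Open splits off the nonce prefix, then authenticates and decrypts.
func (b *box) Open(sealed []byte) ([]byte, error) {
	n := b.aead.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("ciphertext too short")
	}
	return b.aead.Open(nil, sealed[:n], sealed[n:], nil)
}

func main() {
	b, err := newBox(make([]byte, 32)) // all-zero demo key; never use in production
	if err != nil {
		panic(err)
	}
	sealed, err := b.Seal([]byte("app-secret"))
	if err != nil {
		panic(err)
	}
	plain, err := b.Open(sealed)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(sealed) > len(plain), string(plain) == "app-secret")
}
```

Because the cipher is authenticated, a tampered or truncated column value fails in `Open` instead of decrypting to garbage, which matches the error path wrapped by DecryptAppSecret.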