1 Commits

Author SHA1 Message Date
f58995deb0 refactor(server/lark): collapse HTTP_ENABLED + WS_ENABLED into the SECRET_KEY gate (MUL-2671)
MULTICA_LARK_HTTP_ENABLED and MULTICA_LARK_WS_ENABLED were staging
knobs from the multi-PR rollout of the Lark MVP — they let the DB
schema + inbound dispatcher land before the HTTP wire was real, and
before the WS long-conn protocol was wired. Now that the MVP has
shipped end-to-end, "I set SECRET_KEY but I don't want to talk to
Lark" is not a useful production state: setting the at-rest master
key is the operator's opt-in for the integration as a whole.

Collapse the gate down to MULTICA_LARK_SECRET_KEY alone. When the
key is present, wire the real HTTPAPIClient + the real
WSLongConnConnector. CI / integration tests that want stub-style
behaviour can point MULTICA_LARK_HTTP_BASE_URL at a mock server
(already supported) instead of toggling a separate flag. Host
overrides (HTTP_BASE_URL, REGISTRATION_DOMAIN, CALLBACK_BASE_URL)
stay — those are real ops needs for international tenants / staging.

NewStubAPIClient + NoopConnectorFactory remain exported because the
test suite uses them directly; only the router boot path stops
reaching for them. The connector factory keeps its noop fallback
for the case where the endpoint fetcher fails to construct, so a
malformed MULTICA_LARK_CALLBACK_BASE_URL degrades gracefully
(visible as "connector=noop" in the boot log) instead of panicking
the server.
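The degrade-instead-of-panic pattern can be sketched roughly as follows. The `Connector` interface, both connector types, and the URL validation rule are hypothetical stand-ins for illustration; the real factory returns a `lark.ConnectorFactory` and its validation lives in the endpoint fetcher.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/url"
)

// Connector is a hypothetical stand-in for the real event connector interface.
type Connector interface{ Name() string }

type noopConnector struct{}

func (noopConnector) Name() string { return "noop" }

type wsConnector struct{ endpoint *url.URL }

func (wsConnector) Name() string { return "ws-long-conn" }

// newWSConnector validates the callback base URL before building the real
// connector. The http(s)-plus-host rule is an assumption made for this sketch.
func newWSConnector(rawBaseURL string) (Connector, error) {
	u, err := url.Parse(rawBaseURL)
	if err != nil {
		return nil, err
	}
	if (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
		return nil, errors.New("callback base URL must be an absolute http(s) URL")
	}
	return wsConnector{endpoint: u}, nil
}

// buildConnector returns the connector plus a short label for the boot log.
// A construction error is logged and downgraded to the noop connector so
// that a bad config value cannot take the whole server down.
func buildConnector(rawBaseURL string) (Connector, string) {
	conn, err := newWSConnector(rawBaseURL)
	if err != nil {
		log.Printf("connector init failed; falling back to noop: %v", err)
		return noopConnector{}, "noop"
	}
	return conn, "ws-long-conn"
}

func main() {
	for _, raw := range []string{"https://open.feishu.cn", "://bad"} {
		_, label := buildConnector(raw)
		fmt.Printf("connector=%s\n", label)
	}
}
```

Returning the label alongside the connector keeps the degraded state visible at boot without the caller having to inspect concrete types.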

Lark integration + handler tests still pass; go vet clean.

Co-authored-by: multica-agent <github@multica.ai>
2026-06-03 19:35:35 +08:00
2 changed files with 69 additions and 93 deletions

View File

@@ -197,33 +197,24 @@ func NewRouterWithOptions(pool *pgxpool.Pool, hub *realtime.Hub, bus *events.Bus
 h.LarkBindingTokens = lark.NewBindingTokenService(queries, pool)
 slog.Info("lark integration enabled")
-// APIClient selection: when MULTICA_LARK_HTTP_ENABLED is
-// "true" the real Lark Open Platform HTTP client is wired
-// (IM v1 send/patch + binding-prompt + bot info).
-// Otherwise the stub stays in place and every outbound
-// call surfaces ErrAPIClientNotConfigured — useful for
-// deployments that want the inbound dispatcher / database
-// surface online without committing to a Lark app yet.
+// APIClient: wire the real Lark Open Platform HTTP client
+// (IM v1 send/patch + binding-prompt + bot info). Setting
+// MULTICA_LARK_SECRET_KEY is the operator's opt-in for
+// the integration as a whole; we don't expose a separate
+// "HTTP enabled" knob because the inbound dispatcher
+// without outbound replies is not a useful production
+// state, and CI / integration tests that want to avoid
+// real Lark traffic can point MULTICA_LARK_HTTP_BASE_URL
+// at a mock server.
 //
 // MULTICA_LARK_HTTP_BASE_URL overrides the default
 // open.feishu.cn host (set to https://open.larksuite.com
 // for the Lark international tenant, or to a mock for
 // integration tests).
-var larkClient lark.APIClient
-if strings.EqualFold(strings.TrimSpace(os.Getenv("MULTICA_LARK_HTTP_ENABLED")), "true") {
-larkClient = lark.NewHTTPAPIClient(lark.HTTPClientConfig{
-BaseURL: strings.TrimSpace(os.Getenv("MULTICA_LARK_HTTP_BASE_URL")),
-Logger: slog.Default(),
-})
-slog.Info("lark http api client enabled")
-} else {
-larkClient = lark.NewStubAPIClient(slog.Default())
-}
-// Expose the APIClient to handlers so the install
-// surface can consult IsConfigured — install_supported
-// flips true only once the real HTTP client is wired
-// (the stub cannot complete the post-poll GetBotInfo
-// call that finalizes a device-flow install).
+larkClient := lark.NewHTTPAPIClient(lark.HTTPClientConfig{
+BaseURL: strings.TrimSpace(os.Getenv("MULTICA_LARK_HTTP_BASE_URL")),
+Logger: slog.Default(),
+})
 h.LarkAPIClient = larkClient
 patcher := lark.NewPatcher(queries, installSvc, larkClient, lark.PatcherConfig{})
 patcher.Register(bus)
@@ -247,26 +238,19 @@ func NewRouterWithOptions(pool *pgxpool.Pool, hub *realtime.Hub, bus *events.Bus
 }
 // WS Hub: lease + supervisor goroutines per installation.
-// The factory we hand the Hub picks one of two
-// connectors:
-//
-// - NoopConnector (default): holds the lease + sweeps
-// supervisors against real DB rows without dialing
-// Lark. Used on staging boxes that need the lease /
-// reconnect lifecycle exercised before the live WS
-// protocol is enabled, and as the safe fallback when
-// the outbound HTTP APIClient is the stub (no real
-// bearer means no `connection_token` call).
-//
-// - WSLongConnConnector (MULTICA_LARK_WS_ENABLED=true):
-// real Lark long-conn over gorilla/websocket. The
-// connector wraps every read with a ctx-cancel
-// watchdog so lease loss / shutdown breaks the
-// blocking ReadMessage in bounded time — the
-// invariant §4.4 leans on. Requires the HTTP client
-// to be enabled because the connection_token POST
-// piggybacks on its tenant_access_token cache.
-connectorFactory, connectorLabel := buildLarkConnectorFactory(larkClient, installSvc)
+// The WSLongConnConnector talks Lark's long-conn protocol
+// over gorilla/websocket. The connector wraps every read
+// with a ctx-cancel watchdog so lease loss / shutdown
+// breaks the blocking ReadMessage in bounded time — the
+// invariant §4.4 leans on. If the endpoint fetcher fails
+// to initialize (bad MULTICA_LARK_CALLBACK_BASE_URL or
+// similar config error), buildLarkConnectorFactory logs
+// and falls back to the NoopConnector so the lease /
+// supervisor lifecycle still runs against real DB rows —
+// inbound messages will be silently dropped until the
+// config is fixed, with the boot log labelling the mode
+// "noop" so operators can spot it.
+connectorFactory, connectorLabel := buildLarkConnectorFactory(installSvc)
 h.LarkHub = lark.NewHub(queries, connectorFactory, dispatcher, lark.HubConfig{})
 // OutcomeReplier wires the outbound side of the
@@ -305,29 +289,25 @@ func NewRouterWithOptions(pool *pgxpool.Pool, hub *realtime.Hub, bus *events.Bus
 // lark_user_binding in one DB transaction. The optional
 // MULTICA_LARK_REGISTRATION_DOMAIN / _LARK_DOMAIN env
 // vars override the protocol hosts for staging / dev.
-if larkClient.IsConfigured() {
-regCfg := lark.RegistrationConfig{
-Domain: strings.TrimSpace(os.Getenv("MULTICA_LARK_REGISTRATION_DOMAIN")),
-LarkDomain: strings.TrimSpace(os.Getenv("MULTICA_LARK_REGISTRATION_LARK_DOMAIN")),
-}
-regClient := lark.NewRegistrationClient(regCfg)
-regSvc, rerr := lark.NewRegistrationService(
-lark.RegistrationServiceConfig{Logger: slog.Default()},
-regClient,
-larkClient,
-queries,
-pool,
-installSvc,
-h.LarkBindingTokens,
-)
-if rerr != nil {
-slog.Error("lark: RegistrationService init failed; install disabled", "error", rerr)
-} else {
-h.LarkRegistration = regSvc
-slog.Info("lark device-flow install enabled")
-}
+regCfg := lark.RegistrationConfig{
+Domain: strings.TrimSpace(os.Getenv("MULTICA_LARK_REGISTRATION_DOMAIN")),
+LarkDomain: strings.TrimSpace(os.Getenv("MULTICA_LARK_REGISTRATION_LARK_DOMAIN")),
+}
+regClient := lark.NewRegistrationClient(regCfg)
+regSvc, rerr := lark.NewRegistrationService(
+lark.RegistrationServiceConfig{Logger: slog.Default()},
+regClient,
+larkClient,
+queries,
+pool,
+installSvc,
+h.LarkBindingTokens,
+)
+if rerr != nil {
+slog.Error("lark: RegistrationService init failed; install disabled", "error", rerr)
 } else {
-slog.Info("lark device-flow install disabled (set MULTICA_LARK_HTTP_ENABLED=true to enable)")
+h.LarkRegistration = regSvc
+slog.Info("lark device-flow install enabled")
 }
 }
 }
@@ -948,23 +928,22 @@ func NewRouterWithOptions(pool *pgxpool.Pool, hub *realtime.Hub, bus *events.Bus
 return r, h
 }
-// buildLarkConnectorFactory picks between the staging-mode
-// NoopConnector and the real WS long-conn connector based on
-// MULTICA_LARK_WS_ENABLED. The real connector talks to
-// /callback/ws/endpoint directly with app_id/app_secret (no
-// tenant_access_token bearer needed for the bootstrap), so it can
-// run independently of MULTICA_LARK_HTTP_ENABLED — but outbound
-// reply cards (binding prompt, offline / archived notices) require
-// the HTTP APIClient. We still recommend enabling both together;
-// when only WS is on, the OutcomeReplier surfaces a warning and
-// downgrades to silent drops.
+// buildLarkConnectorFactory wires the real WS long-conn connector
+// that talks to /callback/ws/endpoint directly with app_id/app_secret.
+// The connector wraps every read with a ctx-cancel watchdog so lease
+// loss / shutdown breaks the blocking ReadMessage in bounded time —
+// the invariant §4.4 leans on.
 //
-// Returns the factory plus a short label for the boot log so
-// operators can see at a glance which mode they are in.
-func buildLarkConnectorFactory(client lark.APIClient, installSvc *lark.InstallationService) (lark.ConnectorFactory, string) {
-if !strings.EqualFold(strings.TrimSpace(os.Getenv("MULTICA_LARK_WS_ENABLED")), "true") {
-return lark.NoopConnectorFactory(slog.Default()), "noop"
-}
+// If the endpoint fetcher fails to initialize (typically a malformed
+// MULTICA_LARK_CALLBACK_BASE_URL), we log and fall back to the
+// NoopConnector so the lease / supervisor lifecycle still runs
+// against real DB rows. Inbound messages are silently dropped until
+// the config is fixed; the boot log labels the mode "noop" so the
+// degraded state is visible.
+//
+// Returns the factory plus a short label for the boot log:
+// "ws-long-conn" in the healthy case, "noop" in the fallback case.
+func buildLarkConnectorFactory(installSvc *lark.InstallationService) (lark.ConnectorFactory, string) {
 endpointFetcher, err := lark.NewHTTPConnectionTokenFetcher(lark.HTTPConnectionTokenConfig{
 BaseURL: strings.TrimSpace(os.Getenv("MULTICA_LARK_CALLBACK_BASE_URL")),
 Logger: slog.Default(),
@@ -1000,7 +979,6 @@ func buildLarkConnectorFactory(client lark.APIClient, installSvc *lark.Installat
 slog.Error("lark ws: connector init failed; falling back to noop", "error", err)
 return lark.NoopConnectorFactory(slog.Default()), "noop"
 }
-_ = client // outbound APIClient flows in through OutcomeReplier wiring on the Hub
 return func(_ db.LarkInstallation) (lark.EventConnector, error) {
 return conn, nil
 }, "ws-long-conn"

View File

@@ -130,23 +130,21 @@ type Handler struct {
 // LarkRegistration owns the device-flow install lifecycle: begin
 // a registration session against accounts.feishu.cn, poll, and
 // on success write lark_installation + the installer's
-// lark_user_binding in one DB transaction. Nil when either the
-// at-rest key is unset or the real Lark HTTP APIClient is not
-// wired (the stub cannot complete the post-poll GetBotInfo call).
+// lark_user_binding in one DB transaction. Nil when the at-rest
+// key is unset or the RegistrationService failed to construct at
+// boot.
 LarkRegistration *lark.RegistrationService
 // LarkAPIClient is the live transport that backs SendInteractiveCard,
-// PatchInteractiveCard, SendBindingPromptCard, GetBotInfo. It is
-// `lark.NewStubAPIClient(...)` until the real Lark HTTP client is
-// wired; the UI hides install entry points while IsConfigured()
-// is false so users do not land in a flow that is guaranteed to
-// fail at the bot-info step.
+// PatchInteractiveCard, SendBindingPromptCard, GetBotInfo. The
+// router wires the real Lark HTTP client whenever
+// MULTICA_LARK_SECRET_KEY is set; tests that need a no-op
+// behaviour can swap in `lark.NewStubAPIClient(...)` directly. The
+// UI consults IsConfigured() to decide whether to surface install
+// entry points.
 LarkAPIClient lark.APIClient
 // LarkHub owns the per-installation supervisor goroutines that
 // hold the §4.4 WS lease and run the EventConnector. Nil only
-// when the master at-rest key (MULTICA_LARK_SECRET_KEY) is unset
-// — the inbound pipeline does NOT depend on the outbound HTTP
-// APIClient, so the Hub still wires up under the stub APIClient
-// (the dispatcher and renewer touch DB rows, not Lark wire I/O).
+// when the master at-rest key (MULTICA_LARK_SECRET_KEY) is unset.
 // The router constructs the Hub but does NOT call Run on it; the
 // process owner (main.go) starts it under a long-running context
 // and joins via WaitWithTimeout (bounded wait, fenced by