Compare commits

...

5 Commits

Author SHA1 Message Date
yushen
fdf44a453b Merge remote-tracking branch 'origin/main' into feat/cloud-billing-proxy 2026-05-28 15:58:36 +08:00
yushen
afb199d7bf fix(billing): block mcn_ cloud-node PAT actors too — same kind as mat_
Same-shaped vulnerability the mat_ commit (86d309dbb) just fixed,
caught in re-review: mcn_ cloud-node PATs are also a machine
credential, and they were sailing through RequireHumanActor because
the gate only knew about 'task_token'.

The mcn_ branch in middleware/auth.go was setting X-User-ID from the
verifier's owner_id but not stamping any X-Actor-Source. The
RequireHumanActor gate added in 86d309dbb did:

    if r.Header.Get("X-Actor-Source") == "task_token" { 403 }

so an mcn_ caller looked indistinguishable from a JWT/mul_ caller —
which is exactly the lateral-movement risk the previous commit was
supposed to close. A compromised cloud-runtime EC2 node could read
its owner's billing balance / transactions / batches / topups, and
create checkout / portal sessions, despite us thinking we'd shut
agents/nodes out.

Three changes:

1. middleware/auth.go mcn_ branch now stamps
   X-Actor-Source=cloud_pat alongside X-User-ID. Symmetric with the
   existing 'task_token' stamp in the mat_ branch — server-set,
   stripped from any client-supplied value at the top of the
   middleware, no way for the cloud node to forge or omit.

2. handler/actor_guards.go RequireHumanActor switched from a single-
   value '== task_token' check to an explicit switch covering both
   'task_token' and 'cloud_pat'. Comments were rewritten to call
   out the two-machine-credentials picture explicitly, plus a
   maintenance note that any new machine-credential auth branch
   added later MUST stamp a distinct value AND get reviewed against
   this gate at the same time.

3. middleware/daemon_auth.go was updated for symmetry — strips
   client-supplied X-Actor-Source at the top (Auth has done this
   since MUL-2600; DaemonAuth was missing it) and stamps
   X-Actor-Source=cloud_pat in the mcn_ branch. This doesn't
   directly affect billing (billing routes do not go through
   DaemonAuth), but keeps the two middlewares' contracts uniform
   so an endpoint that ever sits behind both, or moves between
   them, can't be tricked by which auth chain happened to handle
   the request.

Tests:

  * RequireHumanActor — old TestRequireHumanActor_BlocksTaskToken
    folded into a table-driven TestRequireHumanActor_BlocksMachineCredentials
    that walks both 'task_token' and 'cloud_pat'. The table comment
    explicitly says: 'a new machine credential added without a
    corresponding case here would silently grant agents/nodes
    account-level access' — keeps future review honest.
  * Auth mcn_ happy-path test now also asserts X-Actor-Source=cloud_pat
    is stamped after a successful Fleet verify.
  * DaemonAuth mcn_ happy-path test mirrors the same assertion.
  * New TestDaemonAuth_StripsClientSuppliedActorSource pins the new
    top-of-middleware strip so a future regression on it is caught.

go build / go vet / go test ./... clean.
2026-05-28 15:51:32 +08:00
yushen
86d309dbb8 fix(billing): block mat_ task-token actors from /api/cloud-billing/*
Adds handler.RequireHumanActor — a chi middleware that 403s any
request carrying X-Actor-Source=task_token — and applies it to the
cloud-billing route group.

Why this matters (the bug it fixes):

The general Auth middleware (server/internal/middleware/auth.go)
turns every recognized bearer format into the same shape: a stamped
X-User-ID header. JWT cookie, mul_ PAT, AND mat_ task token all
produce a non-empty X-User-ID. That uniformity is correct for issue
/ comment / chat scopes — agents acting as their owner inside a
bounded (agent, task) job is exactly what mat_ is designed for.

It is NOT correct for account-level scopes:

  * Reading balance / transactions / batches / topups list lets an
    agent see its owner's wallet state without the owner approving
    the query.

  * Creating checkout / portal sessions can move money or leak
    payment-method state. A compromised agent (prompt-injected, bad
    MCP tool, escaped quote in scratch data) could spin up a checkout
    bound to an attacker-controlled email or open a Billing Portal
    session.

Before this commit, /api/cloud-billing/* sat outside the workspace
group with only the generic Auth chain, so an mat_ task token's
stamped X-User-ID was indistinguishable from an mul_ PAT's. The
proxy happily forwarded everything to multica-cloud, where the
upstream's owner-only contract was bypassed entirely.

The fix:

  * RequireHumanActor middleware checks the SERVER-SET (not client-
    settable) X-Actor-Source header. The Auth middleware deletes any
    client-supplied value before stamping its own, so a non-empty
    'task_token' value here is authoritative — a member-token caller
    cannot forge it, and an mat_-token caller cannot strip it.

  * Wired in via r.Use(handler.RequireHumanActor) on the
    /api/cloud-billing route group. Applying it as middleware (not
    inline in each handler) means a developer adding a new billing
    endpoint cannot accidentally skip the gate.

  * Deliberately denylist-shaped, not allowlist. Future actor kinds
    added to auth.go (e.g. service-account tokens) get a conscious
    decision-point at the gate rather than being pre-emptively shut
    out by an over-broad rule today.

The Stripe webhook is untouched — it sits outside the entire Auth
group, so X-Actor-Source is never stamped on it (the auth middleware
doesn't run). The guard is irrelevant on that path.

Tests:

  * TestRequireHumanActor_AllowsHumanRequest — empty actor source
    passes through (JWT / mul_ PAT path).
  * TestRequireHumanActor_BlocksTaskToken — 'task_token' value
    triggers 403 and inner handler is not called.
  * TestRequireHumanActor_IgnoresUnknownActorSource — pins the
    denylist shape with explicit comment so the next person adding
    an actor kind knows where to revisit.
  * TestRequireHumanActor_AppliedViaChiRouterUse — small but
    important: builds a real chi router with r.Use(...) and proves
    the guard fires on every endpoint in the group, which is the
    contract router.go relies on.

go build / go vet / go test ./... clean.
2026-05-28 15:40:28 +08:00
yushen
b604c6d9b0 fix(billing): self-review fixes — workspace scope, rate limit, signature gate
Addresses review feedback on the initial /api/cloud-billing proxy.

(P2) Move /api/cloud-billing/* out of the workspace-scoped group.
Billing is account-level — the upstream owner model is single-user
per X-User-ID and has no workspace concept. Routes were nested under
RequireWorkspaceMember, so any caller without an active workspace
slug got a 400/404 from the membership middleware before the proxy
could run. Now sits next to /api/me / /api/tokens in the user-scoped
section, still inside the Auth group but free of the workspace
membership check.

Stripe webhook hardening:

  * Per-IP rate limit (reuses h.WebhookIPRateLimiter, same limiter
    HandleAutopilotWebhook uses). The endpoint is public; without
    rate limiting an attacker can flood it and force one upstream
    round-trip per request, burning the cloud's webhook handling
    budget. Fast-path 429 stops the bleed before we touch the body.

  * Early 401 on missing Stripe-Signature. Real Stripe deliveries
    always include the header; absence is a confident not-from-Stripe
    signal. Local 401 saves the upstream RTT on the most common
    illegitimate poke (curl from a probe). It is a strict superset of
    what the upstream would do (cloud also rejects missing-signature
    with 401), so Stripe's delivery dashboard sees the same outcome
    that would have come from the cloud.

  * Forward Stripe's original Content-Type. cloudruntime's default
    overrode it to plain application/json, dropping the
    '; charset=utf-8' suffix Stripe always sends. Signature is
    over the body so it didn't break verification, but preserving
    the exact header is cheap and removes a debug-time surprise.

Code-quality cleanups:

  * Allow-list (not deny-list) on Stripe session_id path param. Old
    check rejected /, ?, # only; allow-list rejects anything outside
    [A-Za-z0-9_]. A future Stripe ID format bump cannot quietly slip
    a new harmful character past us — it gets rejected until we
    consciously widen the rule.

  * Cache http.CanonicalHeaderKey result in cloudruntime.Client.Do
    (was called twice per iteration).

  * gofmt the test file (struct field alignment + stray semicolon).

Tests updated:

  * Existing happy-path test now also asserts upstream Content-Type
    matches the request's '; charset=utf-8' value.
  * TestStripeWebhookMissingSignatureStillForwards →
    TestStripeWebhookMissingSignatureRejectedLocally (asserts 401
    AND that upstream is NOT called).
  * Empty-body test now sets Stripe-Signature header (otherwise the
    new pre-check 401s before the empty-body branch can run).
  * New TestStripeWebhookRateLimited drives the 429 fast-path with
    a denying limiter stub.

go build / go vet / go test ./... clean.
2026-05-28 15:23:32 +08:00
yushen
6cd03f24a0 feat(billing): proxy /api/cloud-billing/* + Stripe webhook to multica-cloud
Mirrors the existing /api/cloud-runtime/* Fleet proxy pattern. Eight
user-authenticated endpoints under /api/cloud-billing/ stamp X-User-ID
from the auth context and forward to /api/v1/billing/* on multica-cloud
(same upstream service / port the cloud-runtime proxy already targets,
per the upstream API doc):

  GET    /api/cloud-billing/balance
  GET    /api/cloud-billing/transactions
  GET    /api/cloud-billing/batches
  GET    /api/cloud-billing/topups
  GET    /api/cloud-billing/price-tiers
  POST   /api/cloud-billing/checkout-sessions
  GET    /api/cloud-billing/checkout-sessions/{sessionId}
  POST   /api/cloud-billing/portal-sessions

Stripe webhook is the outlier and lives outside the Auth group at
POST /api/webhooks/stripe (public ingress, alongside the autopilot
and GitHub webhooks). Three things matter for it and are NOT shared
with the JSON proxy:

  1. No application-layer auth — Stripe signs the raw body with a
     shared secret, the cloud upstream verifies. Adding any
     pre-check here would only weaken the contract.
  2. Body forwarded byte-for-byte. The signature is computed over
     the exact bytes; a json.Unmarshal/Marshal round-trip silently
     breaks verification.
  3. Stripe-Signature header is forwarded verbatim. To support that,
     cloudruntime.Request gains an optional Headers http.Header
     field, applied AFTER the client's defaults (Accept,
     Content-Type) but BEFORE the authoritative X-User-ID /
     X-Request-ID stamps so a caller can't smuggle either of those.

A 1 MiB MaxBytesReader caps webhook bodies so a flood can't make us
buffer arbitrary memory before the upstream rejects on signature.

Frontend integration is out of scope for this PR — the routes are
just plumbing. Per-deployment behavior:

  - SaaS API node with MULTICA_CLOUD_FLEET_URL set: routes work
    end-to-end against the cloud Billing module.
  - Self-hosted (no fleet URL): all proxy routes return 503 'cloud
    runtime is not configured', which is the same fail-soft already
    used for /api/cloud-runtime/*.

Tests in cloud_billing_test.go cover:
  * table-driven happy path for every standard endpoint (method,
    path, user_id stamp, query forwarding for GET-with-query, body
    forwarding for POST)
  * the dynamic-path checkout-session GET (param appended verbatim)
  * path-traversal rejection on session_id (/, ?, # are unsafe)
  * missing-param 400
  * disabled-cloud 503
  * webhook raw-body byte-perfect forwarding (body deliberately has
    leading whitespace + trailing newline + non-canonical key order
    to catch any unmarshal/marshal round-trip)
  * webhook missing Stripe-Signature still forwards (cloud is the
    one that rejects)
  * empty webhook body forwards
  * oversize webhook body returns 413 without calling upstream
  * webhook on a self-hosted deployment returns 503

go build / go vet / go test ./... clean.
2026-05-28 15:01:39 +08:00
10 changed files with 1175 additions and 2 deletions

View File

@@ -275,6 +275,11 @@ func NewRouterWithOptions(pool *pgxpool.Pool, hub *realtime.Hub, bus *events.Bus
// HMAC-SHA256 signature in the handler) and post-install setup callback.
r.Post("/api/webhooks/github", h.HandleGitHubWebhook)
r.Get("/api/github/setup", h.GitHubSetupCallback)
// Stripe webhook (no Multica auth — Stripe signs the raw body
// with a shared secret, the multica-cloud upstream verifies. We
// only forward the bytes + the Stripe-Signature header; see
// HandleCloudBillingStripeWebhook for the rationale).
r.Post("/api/webhooks/stripe", h.HandleCloudBillingStripeWebhook)
// Daemon API routes (require daemon token or valid user token)
r.Route("/api/daemon", func(r chi.Router) {
@@ -391,6 +396,43 @@ func NewRouterWithOptions(pool *pgxpool.Pool, hub *realtime.Hub, bus *events.Bus
r.Delete("/{id}", h.RevokePersonalAccessToken)
})
// Cloud Billing proxy. Same upstream service / port as
// cloud-runtime — multica-cloud's Fleet and Billing share
// :8080 and the same chi router. All routes here forward
// to /api/v1/billing/* with X-User-ID stamped from the
// authenticated context.
//
// User-scoped (account-level), NOT workspace-scoped — sits
// outside the RequireWorkspaceMember group so a user can
// inspect their balance, top up, and open the Billing Portal
// without an active workspace selected. The upstream owner
// model is single-user; X-Workspace-ID would be ignored even
// if we sent it. The Stripe webhook is the public outlier
// and lives outside the entire Auth group (see above).
//
// IMPORTANT — task-token actors are blocked here. The Auth
// middleware happily turns an mat_ task token into a normal
// X-User-ID stamp (so agents can comment, claim issues, etc.
// as their owner), but billing is account-level and a running
// agent reading its owner's balance / opening a checkout
// session is the kind of lateral-movement we're explicitly
// trying to prevent. handler.RequireHumanActor checks the
// authoritative server-set X-Actor-Source header and 403s
// any task-token request. See actor_guards.go for the full
// rationale.
r.Route("/api/cloud-billing", func(r chi.Router) {
r.Use(handler.RequireHumanActor)
r.Get("/balance", h.GetCloudBillingBalance)
r.Get("/transactions", h.ListCloudBillingTransactions)
r.Get("/batches", h.ListCloudBillingBatches)
r.Get("/topups", h.ListCloudBillingTopups)
r.Get("/price-tiers", h.ListCloudBillingPriceTiers)
r.Post("/checkout-sessions", h.CreateCloudBillingCheckoutSession)
r.Get("/checkout-sessions/{sessionId}", h.GetCloudBillingCheckoutSession)
r.Post("/portal-sessions", h.CreateCloudBillingPortalSession)
})
// --- Workspace-scoped routes (all require workspace membership) ---
r.Group(func(r chi.Router) {
r.Use(middleware.RequireWorkspaceMember(queries))

View File

@@ -35,6 +35,18 @@ type Request struct {
Body []byte
UserID string
RequestID string
// Headers carries arbitrary outbound headers that the caller wants
// forwarded verbatim. They are applied AFTER the client's defaults
// (Accept, Content-Type, X-User-ID, X-Request-ID) so a caller
// supplying any of those overrides them — useful when proxying a
// request whose Content-Type is not application/json or whose
// signed body must not be touched (e.g. the Stripe webhook
// passthrough preserving Stripe-Signature alongside the raw body).
//
// Nil / empty is the common case; existing call sites can ignore
// this field.
Headers http.Header
}
type Response struct {
@@ -96,6 +108,25 @@ func (c *Client) Do(ctx context.Context, req Request) (*Response, error) {
if len(req.Body) > 0 {
httpReq.Header.Set("Content-Type", "application/json")
}
// Caller-supplied headers go before the X-User-ID / X-Request-ID
// stamps, since those are derived from authenticated context and
// must not be overridable by the caller. Defaults (Accept,
// Content-Type) ARE overridable — a webhook passthrough may need
// to preserve a non-JSON Content-Type.
for k, vs := range req.Headers {
// Skip the headers we stamp authoritatively below. Canonicalize
// once per key — http.CanonicalHeaderKey allocates on its
// fast path so calling it twice per iteration would double
// the per-request header overhead for no reason.
canon := http.CanonicalHeaderKey(k)
if canon == "X-User-Id" || canon == "X-Request-Id" {
continue
}
httpReq.Header.Del(k)
for _, v := range vs {
httpReq.Header.Add(k, v)
}
}
if req.UserID != "" {
httpReq.Header.Set("X-User-ID", req.UserID)
}

View File

@@ -0,0 +1,108 @@
package handler
import (
"net/http"
)
// RequireHumanActor is a chi-style middleware that rejects requests
// authenticated via a machine credential — currently mat_ task tokens
// and mcn_ cloud-node PATs. It exists for endpoints whose
// authorization model is "the human owner authorized this", not
// "anyone holding the owner's credentials authorized this".
//
// Why this guard is needed (read carefully — auth here is subtle):
//
// The general Auth middleware (server/internal/middleware/auth.go)
// turns four different bearer formats into the same shape — a stamped
// `X-User-ID` header — so downstream handlers don't have to care which
// token kind the caller used:
//
// - JWT cookie / mul_ PAT → X-User-ID = the human's user id.
// X-Actor-Source is left empty.
// - mat_ task token → X-User-ID = the OWNING human's user id,
// plus X-Agent-ID, X-Task-ID, and the
// authoritative server-set header
// `X-Actor-Source: task_token`.
// - mcn_ cloud-node PAT → X-User-ID = the OWNING human's user id,
// plus `X-Actor-Source: cloud_pat`.
// The token authenticates a cloud-runtime
// EC2 node operating on the owner's
// behalf — same conceptual category as
// mat_ (machine running owner-scoped
// code) for authorization purposes.
//
// The mat_ and mcn_ designs (MUL-2600 and the cloud-node PAT story
// respectively) were both deliberately built this way: every request
// the agent / node makes is treated as the owner's, so they can
// post comments, claim issues, register runtimes, etc., as if the
// owner had done it. That is correct for issue / comment / chat
// scopes — those are bounded by workspace membership and by the
// task or runtime binding.
//
// It is NOT correct for account-level scopes:
//
// - Billing balance / transactions / batches / topups list
// are user-scoped. A running agent or a cloud node could read
// its owner's wallet state without the owner ever having
// approved a billing query.
//
// - Checkout / portal session creation can move money. A machine
// credential that gets compromised — by a prompt injection, a
// bad MCP tool, an escaped quote in scratch data, or a node
// escape — could spin up a checkout for an attacker-controlled
// email or open a Billing Portal session that leaks subscription
// / payment-method state.
//
// `X-Actor-Source` is server-set only. The Auth middleware deletes any
// client-supplied value first (see auth.go: `r.Header.Del("X-Actor-Source")`),
// then re-sets it ONLY on the mat_ and mcn_ branches. So checking
// this header is the safe, fast, single-source-of-truth way to know
// "is the request from a machine credential?" — without re-querying
// the token table.
//
// We deliberately do NOT use h.resolveActor() here:
//
// - resolveActor's primary job is "agent vs member" classification
// for ownership / authorship attribution (issue creator, comment
// author, etc.). It also has a fallback path that trusts
// X-Agent-ID + X-Task-ID for legacy CLI flows; that fallback is
// valid for resolving authorship but is irrelevant here. Billing
// authorization needs the strict "machine credential → forbidden"
// gate, nothing else.
// - resolveActor takes a workspaceID parameter; billing routes have
// no workspace context, so threading one through just to call it
// would be misleading.
// - resolveActor doesn't currently classify mcn_ cloud-node PATs
// because cloud nodes don't act on workspace-scoped resources
// where author attribution matters. Bolting that classification
// into resolveActor solely to reuse it here would be the wrong
// coupling.
//
// Apply via `r.Use(handler.RequireHumanActor)` on a chi route group.
// The middleware is intentionally NOT wired in via the router's main
// Auth chain — the default contract elsewhere (issues, chat, etc.) is
// "agent and human are interchangeable", and adding a global gate
// would break legitimate agent traffic. Only attach it where the
// scope is truly human-only.
//
// To extend: any new machine-credential auth branch added to
// auth.go (e.g. a hypothetical service-account token) MUST stamp a
// distinct X-Actor-Source value AND get reviewed against this gate
// at the same time. The denylist below is intentionally explicit —
// silently passing an unknown actor source is a feature, not a bug
// (see TestRequireHumanActor_IgnoresUnknownActorSource), but the
// addition of a new value is the moment to decide whether it's
// human-equivalent or machine-equivalent.
func RequireHumanActor(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// X-Actor-Source is server-set only. The auth middleware
// strips any client-supplied value before stamping its own,
// so a non-empty value here is authoritative.
switch r.Header.Get("X-Actor-Source") {
case "task_token", "cloud_pat":
writeError(w, http.StatusForbidden, "this endpoint is only available to human actors")
return
}
next.ServeHTTP(w, r)
})
}

View File

@@ -0,0 +1,162 @@
package handler
import (
"net/http"
"net/http/httptest"
"testing"
"github.com/go-chi/chi/v5"
)
// TestRequireHumanActor_AllowsHumanRequest pins the happy path: a
// request that passed Auth as a JWT or mul_ PAT does NOT carry
// X-Actor-Source, so the guard lets it through and the inner handler
// runs.
//
// We construct a bare http.Handler chain (no full router) so the test
// exercises only the middleware logic and is independent of any
// downstream wiring.
func TestRequireHumanActor_AllowsHumanRequest(t *testing.T) {
called := false
next := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
called = true
w.WriteHeader(http.StatusOK)
})
mw := RequireHumanActor(next)
req := httptest.NewRequest(http.MethodGet, "/api/cloud-billing/balance", nil)
// No X-Actor-Source — this is the JWT / mul_ PAT shape.
w := httptest.NewRecorder()
mw.ServeHTTP(w, req)
if w.Code != http.StatusOK {
t.Fatalf("status = %d, want 200", w.Code)
}
if !called {
t.Fatal("inner handler must run for non-task-token requests")
}
}
// TestRequireHumanActor_BlocksMachineCredentials walks every machine-
// credential X-Actor-Source value the auth middlewares stamp today
// and confirms each is rejected with 403. The two values must stay
// in lockstep with auth.go and daemon_auth.go: a new machine
// credential added there without a corresponding case here would
// silently grant agents/nodes account-level access.
func TestRequireHumanActor_BlocksMachineCredentials(t *testing.T) {
cases := []struct {
name string
actorSource string
}{
// mat_ task token — set in middleware/auth.go's mat_ branch.
// An agent process holding its task-scoped token must not be
// able to read its owner's billing data.
{name: "task_token", actorSource: "task_token"},
// mcn_ cloud-node PAT — set in BOTH middleware/auth.go and
// middleware/daemon_auth.go's mcn_ branches. A cloud-runtime
// EC2 node operating on the owner's behalf is the same kind
// of machine credential as mat_ for billing-authorization
// purposes.
{name: "cloud_pat", actorSource: "cloud_pat"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
next := http.HandlerFunc(func(_ http.ResponseWriter, _ *http.Request) {
t.Fatalf("inner handler must NOT run for actor source %q", tc.actorSource)
})
mw := RequireHumanActor(next)
req := httptest.NewRequest(http.MethodGet, "/api/cloud-billing/balance", nil)
// This is what the Auth (or DaemonAuth) middleware sets
// for the matching token kind. Setting it directly here
// proves the gate triggers on the header regardless of
// upstream context — the auth middlewares strip any
// client-supplied value before stamping their own, so a
// non-empty value at this point IS authoritative.
req.Header.Set("X-Actor-Source", tc.actorSource)
w := httptest.NewRecorder()
mw.ServeHTTP(w, req)
if w.Code != http.StatusForbidden {
t.Fatalf("status = %d, want 403", w.Code)
}
})
}
}
// TestRequireHumanActor_IgnoresUnknownActorSource pins the gate's
// scope: it is an explicit denylist against the known-bad
// "task_token" value, NOT an allowlist against "human only / empty".
//
// Why the denylist shape:
//
// - The Auth middleware today sets X-Actor-Source for exactly one
// case: mat_ task tokens. Every other authenticated path (JWT,
// mul_ PAT) leaves the header empty. So "non-empty AND not
// task_token" is unreachable in current production.
//
// - If a future actor kind is added (say a hypothetical
// `service_account` token), this gate's silence on the new value
// is a CONSCIOUS DECISION POINT, not an accident. The added auth
// branch is the right place to decide whether the new kind should
// be allowed at billing endpoints — and that decision belongs in
// a security review at the time, not in a default-deny rule here
// that pre-emptively shuts out hypothetical use cases we cannot
// reason about today.
//
// If you are reading this comment because a new actor kind needs to
// reach billing or needs to be blocked from it, update
// RequireHumanActor to handle the new kind explicitly (and update
// this test's expectation accordingly).
func TestRequireHumanActor_IgnoresUnknownActorSource(t *testing.T) {
called := false
next := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
called = true
w.WriteHeader(http.StatusOK)
})
mw := RequireHumanActor(next)
// A hypothetical future value the gate doesn't know about.
req := httptest.NewRequest(http.MethodGet, "/api/cloud-billing/balance", nil)
req.Header.Set("X-Actor-Source", "future_kind")
w := httptest.NewRecorder()
mw.ServeHTTP(w, req)
if w.Code != http.StatusOK {
t.Fatalf("status = %d, want 200 — gate should only block exact 'task_token'", w.Code)
}
if !called {
t.Fatal("inner handler must run for unknown actor sources")
}
}
// TestRequireHumanActor_AppliedViaChiRouterUse pins the wiring side of
// the contract: when the guard is attached to a chi route group via
// r.Use, every endpoint in that group is protected, and a task-token
// request never reaches the handler — even one we add later. This is
// what router.go's r.Route("/api/cloud-billing", ...) + r.Use(...)
// guarantees in production; the test is small but a developer adding
// a new billing endpoint and forgetting to re-attach the middleware
// would not be caught by the per-handler tests above.
func TestRequireHumanActor_AppliedViaChiRouterUse(t *testing.T) {
// Use a real chi router so we exercise r.Use(), not just the
// middleware function in isolation.
r := chi.NewRouter()
r.Use(RequireHumanActor)
r.Get("/billing/probe", func(_ http.ResponseWriter, _ *http.Request) {
t.Fatal("inner handler must NOT run when guard rejects")
})
req := httptest.NewRequest(http.MethodGet, "/billing/probe", nil)
req.Header.Set("X-Actor-Source", "task_token")
w := httptest.NewRecorder()
r.ServeHTTP(w, req)
if w.Code != http.StatusForbidden {
t.Fatalf("status = %d, want 403", w.Code)
}
}

View File

@@ -0,0 +1,294 @@
package handler
import (
"errors"
"io"
"net/http"
"github.com/go-chi/chi/v5"
"github.com/multica-ai/multica/server/internal/cloudruntime"
)
// Cloud billing endpoints proxy to the same multica-cloud HTTP service
// that backs cloud-runtime (Fleet and Billing share `:8080` per the
// upstream README). All paths here forward verbatim to /api/v1/billing/*
// on the cloud side, mirroring the cloud-runtime handler shape:
//
// - User-facing endpoints sit under /api/cloud-billing/* in our router
// and require the regular Auth middleware. We inject the resolved
// user_id as `X-User-ID` so the cloud side can scope owner queries.
//
// - The Stripe webhook is the one outlier: it lives at
// /api/webhooks/stripe (outside the Auth group), takes the raw
// request body byte-for-byte, and forwards `Stripe-Signature` for
// the cloud side to verify against its STRIPE_WEBHOOK_SECRET. The
// upstream contract is explicit about this:
// "webhook 使用原始请求体进行签名校验,不要在反向代理里改写 body."
//
// All proxy paths share `proxyCloudRuntime` for the standard JSON
// shape and only the webhook needs a custom raw-body forwarder.
// maxStripeWebhookBodySize bounds the raw body we'll forward upstream.
// Stripe's documented event payload upper bound is well under this;
// the cap exists to keep a malicious sender from making us read
// arbitrary memory before the upstream gets to reject the signature.
const maxStripeWebhookBodySize = 1 << 20 // 1 MiB
// stripeSignatureHeader is the canonical name of the header Stripe
// uses to ship its HMAC over the raw body. We forward whatever the
// client sent verbatim; the cloud side is the one that knows the
// shared secret and rejects on mismatch.
const stripeSignatureHeader = "Stripe-Signature"
// GetCloudBillingBalance forwards GET /api/v1/billing/balance.
//
// Returns the caller's wallet balance. Cloud reads `X-User-ID`; we
// stamp it from the authenticated context.
func (h *Handler) GetCloudBillingBalance(w http.ResponseWriter, r *http.Request) {
h.proxyCloudRuntime(w, r, http.MethodGet, "/api/v1/billing/balance", cloudRuntimeProxyOptions{
withUserID: true,
})
}
// ListCloudBillingTransactions forwards GET /api/v1/billing/transactions.
//
// The upstream supports `page` / `page_size`; we forward the query
// string unchanged.
func (h *Handler) ListCloudBillingTransactions(w http.ResponseWriter, r *http.Request) {
h.proxyCloudRuntime(w, r, http.MethodGet, "/api/v1/billing/transactions", cloudRuntimeProxyOptions{
withUserID: true,
withQuery: true,
})
}
// ListCloudBillingBatches forwards GET /api/v1/billing/batches.
//
// Returns paginated topup / bonus batches for the owner; same query
// shape as transactions.
func (h *Handler) ListCloudBillingBatches(w http.ResponseWriter, r *http.Request) {
h.proxyCloudRuntime(w, r, http.MethodGet, "/api/v1/billing/batches", cloudRuntimeProxyOptions{
withUserID: true,
withQuery: true,
})
}
// ListCloudBillingTopups forwards GET /api/v1/billing/topups.
func (h *Handler) ListCloudBillingTopups(w http.ResponseWriter, r *http.Request) {
h.proxyCloudRuntime(w, r, http.MethodGet, "/api/v1/billing/topups", cloudRuntimeProxyOptions{
withUserID: true,
withQuery: true,
})
}
// ListCloudBillingPriceTiers forwards GET /api/v1/billing/price-tiers.
//
// Per the upstream doc, this endpoint requires `X-User-ID` (it sits
// under the same auth fence as the rest of /api/v1/billing/*), even
// though the response is the same for every owner today. We stamp the
// header so cloud can audit who's listing tiers — and so the contract
// stays uniform if pricing later differentiates per-customer.
func (h *Handler) ListCloudBillingPriceTiers(w http.ResponseWriter, r *http.Request) {
h.proxyCloudRuntime(w, r, http.MethodGet, "/api/v1/billing/price-tiers", cloudRuntimeProxyOptions{
withUserID: true,
})
}
// CreateCloudBillingCheckoutSession forwards POST /api/v1/billing/checkout-sessions.
//
// Body shape (per upstream): `{tier_id, customer_email?}`. We don't
// care about its contents — proxyCloudRuntime validates only that
// it's syntactically JSON and forwards the bytes.
func (h *Handler) CreateCloudBillingCheckoutSession(w http.ResponseWriter, r *http.Request) {
h.proxyCloudRuntime(w, r, http.MethodPost, "/api/v1/billing/checkout-sessions", cloudRuntimeProxyOptions{
withUserID: true,
withBody: true,
})
}
// GetCloudBillingCheckoutSession forwards GET /api/v1/billing/checkout-sessions/{session_id}.
//
// The path param goes into the upstream URL; if missing we return 400
// before doing any network work.
func (h *Handler) GetCloudBillingCheckoutSession(w http.ResponseWriter, r *http.Request) {
sessionID := chi.URLParam(r, "sessionId")
if sessionID == "" {
writeError(w, http.StatusBadRequest, "session_id is required")
return
}
// Stripe Checkout session IDs are `cs_<base62>` by construction —
// alphanumeric plus underscore, nothing else. We splice the value
// straight into the upstream URL path, so anything outside that
// alphabet could in principle introduce path / query / fragment
// segments that re-target the request. Allow-list rather than
// deny-list: a future Stripe-format bump cannot quietly slip a
// new "harmful" character past us, it just rejects until we
// widen the rule consciously.
if !isValidStripeSessionID(sessionID) {
writeError(w, http.StatusBadRequest, "invalid session_id")
return
}
h.proxyCloudRuntime(w, r, http.MethodGet, "/api/v1/billing/checkout-sessions/"+sessionID, cloudRuntimeProxyOptions{
withUserID: true,
})
}
// isValidStripeSessionID reports whether s is a syntactically
// plausible Stripe Checkout session id: a non-empty string of
// `[A-Za-z0-9_]`. Stripe's own format is `cs_<base62>`, but we
// accept the broader alphanumeric+underscore set because (a) it
// covers every Stripe ID variant that might ever route here and
// (b) it stays in lockstep with Stripe's own ID grammar without
// hard-coding the `cs_` prefix.
func isValidStripeSessionID(s string) bool {
if s == "" {
return false
}
for _, c := range s {
switch {
case c >= 'a' && c <= 'z':
case c >= 'A' && c <= 'Z':
case c >= '0' && c <= '9':
case c == '_':
default:
return false
}
}
return true
}
// CreateCloudBillingPortalSession forwards POST /api/v1/billing/portal-sessions.
//
// Body shape is upstream-defined; can be empty. We treat it as
// optional JSON: cloud_runtime helper rejects empty bodies on
// withBody=true, so for portal-sessions we explicitly do NOT mark
// withBody and also send no body upstream. If the upstream contract
// later requires a body, switch this to withBody and let cloud
// validate.
func (h *Handler) CreateCloudBillingPortalSession(w http.ResponseWriter, r *http.Request) {
h.proxyCloudRuntime(w, r, http.MethodPost, "/api/v1/billing/portal-sessions", cloudRuntimeProxyOptions{
withUserID: true,
})
}
// HandleCloudBillingStripeWebhook is the public ingress for Stripe
// webhook deliveries. Three things are critical here and are *not*
// shared with the standard proxyCloudRuntime path:
//
// 1. NO authentication. Stripe POSTs from its own infrastructure;
// we don't have a user context and don't try to invent one.
// Application-layer auth is replaced by Stripe's HMAC signature,
// which the upstream cloud service verifies. This route therefore
// sits OUTSIDE the Auth group in router.go.
//
// 2. The body is forwarded byte-for-byte. Stripe's signature is
// computed over the exact bytes it sent. We must NOT json.Unmarshal
// /re-marshal, trim whitespace, or otherwise touch the payload —
// even a single byte difference fails verification.
//
// 3. The `Stripe-Signature` header is forwarded verbatim. That's
// the entire authentication channel from Stripe to the upstream.
//
// We DO bound the read with MaxBytesReader so a malicious or
// misconfigured sender can't make us buffer arbitrary memory before
// the upstream rejects on signature.
//
// Two pre-checks happen BEFORE we read the body:
//
// - Per-IP rate limit (mirrors HandleAutopilotWebhook). The
// endpoint is public; without rate limiting an attacker spraying
// bogus payloads forces an upstream round-trip per request and
// burns the cloud's webhook handling budget. A spray of bad
// signatures still counts toward this limit, so the fast-path
// 429 stops the bleed.
//
// - Mandatory `Stripe-Signature` header. Real Stripe deliveries
// always include it; absence ≡ not from Stripe. Returning 401
// locally saves an upstream RTT on the most common kind of
// unauthorized poke (curl from a script kiddie). This is a strict
// superset of what the upstream would do — Cloud also rejects
// missing-signature with 401 — so it does not change Stripe's
// own delivery dashboard view.
func (h *Handler) HandleCloudBillingStripeWebhook(w http.ResponseWriter, r *http.Request) {
if h.CloudRuntime == nil || !h.CloudRuntime.Enabled() {
writeError(w, http.StatusServiceUnavailable, "cloud runtime is not configured")
return
}
// Per-IP rate limit BEFORE reading the body or hitting upstream.
// We deliberately reuse the same limiter as the autopilot
// webhook: both are public unauthenticated ingress with the same
// abuse profile, and budgeting them together gives a single knob
// to tune.
if h.WebhookIPRateLimiter != nil {
if ip := h.clientIPForRateLimit(r); ip != "" {
if !h.WebhookIPRateLimiter.Allow(r.Context(), ip) {
writeError(w, http.StatusTooManyRequests, "rate limit exceeded")
return
}
}
}
// Real Stripe deliveries always include Stripe-Signature; the
// absence is a confident "not from Stripe" signal. Reject locally
// to save the upstream RTT. NOTE: we use Header.Values to detect
// presence rather than Get, so a header explicitly set to "" still
// counts as missing (matches the upstream's interpretation).
if len(r.Header.Values(stripeSignatureHeader)) == 0 {
writeError(w, http.StatusUnauthorized, "missing Stripe-Signature header")
return
}
// Body cap matches the JSON proxy. Stripe's documented payload
// ceiling is much smaller; the limit is defense, not contract.
r.Body = http.MaxBytesReader(w, r.Body, maxStripeWebhookBodySize)
// io.ReadAll is appropriate here because the webhook body is
// fully consumed before forwarding (Stripe signs the bytes; we
// can't stream). Unlike the JSON proxy we deliberately do NOT
// trim whitespace or json-validate — the upstream signature
// check is computed over exactly what we received, so any
// transformation here would silently break verification.
body, err := io.ReadAll(r.Body)
if err != nil {
var maxErr *http.MaxBytesError
if errors.As(err, &maxErr) {
writeError(w, http.StatusRequestEntityTooLarge, "request body is too large")
return
}
writeError(w, http.StatusBadRequest, "invalid request body")
return
}
// Forward Stripe-Signature verbatim, plus Stripe's original
// Content-Type (so the upstream sees the exact same header set
// it would if Stripe were calling it directly). cloudruntime's
// default Content-Type would override `application/json;
// charset=utf-8` to plain `application/json`; signature
// verification doesn't care (HMAC is over the body), but
// preserving the exact header is cheap and removes a debug-time
// "why does this header look different?" surprise. Caller-
// supplied Headers in cloudruntime.Request override defaults, so
// putting Content-Type here is enough.
headers := http.Header{}
if sigs := r.Header.Values(stripeSignatureHeader); len(sigs) > 0 {
headers[stripeSignatureHeader] = sigs
}
if cts := r.Header.Values("Content-Type"); len(cts) > 0 {
headers["Content-Type"] = cts
}
resp, err := h.CloudRuntime.Do(r.Context(), cloudruntime.Request{
Method: http.MethodPost,
Path: "/api/v1/webhooks/stripe",
Body: body,
Headers: headers,
RequestID: cloudRuntimeRequestID(r),
// Intentionally no UserID — webhook is unauthenticated by
// design; injecting an empty header would still be harmless,
// but staying explicit makes the contract obvious to readers.
})
if err != nil {
writeCloudRuntimeError(w, r, err)
return
}
writeCloudRuntimeResponse(w, resp)
}

View File

@@ -0,0 +1,439 @@
package handler
import (
"bytes"
"context"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/multica-ai/multica/server/internal/cloudruntime"
)
// proxyExpectation captures the assertions every standard
// cloud-billing endpoint shares: it must call the cloud proxy with a
// specific method/path, must stamp X-User-ID from the authenticated
// context, and must return the upstream response untouched.
//
// Reusing this table-driven helper keeps the per-endpoint tests small
// — the interesting per-endpoint logic lives in `withQuery` /
// `withBody` / dynamic-path-param branches.
type billingProxyCase struct {
name string
method string
path string // path on OUR router, e.g. /api/cloud-billing/balance
body any // nil for GET; map / struct for POST bodies
wantPx string // expected upstream path
wantQ string // expected upstream query (encoded form), "" if none
invoke func(t *testing.T, w http.ResponseWriter, r *http.Request)
}
// TestCloudBillingProxiesForwardCorrectly walks every standard
// endpoint at once: each one must hit the right upstream path with
// the right method and the caller's user id. Single test = single
// stub configuration; we just rotate which handler we invoke. This
// is the cheapest way to keep all 7 standard endpoints covered
// without duplicating the proxy plumbing per test.
func TestCloudBillingProxiesForwardCorrectly(t *testing.T) {
cases := []billingProxyCase{
{
name: "balance",
method: http.MethodGet,
path: "/api/cloud-billing/balance",
wantPx: "/api/v1/billing/balance",
invoke: func(t *testing.T, w http.ResponseWriter, r *http.Request) {
testHandler.GetCloudBillingBalance(w, r)
},
},
{
name: "transactions list with paging",
method: http.MethodGet,
path: "/api/cloud-billing/transactions?page=2&page_size=50",
wantPx: "/api/v1/billing/transactions",
wantQ: "page=2&page_size=50",
invoke: func(t *testing.T, w http.ResponseWriter, r *http.Request) {
testHandler.ListCloudBillingTransactions(w, r)
},
},
{
name: "batches list",
method: http.MethodGet,
path: "/api/cloud-billing/batches?page_size=10",
wantPx: "/api/v1/billing/batches",
wantQ: "page_size=10",
invoke: func(t *testing.T, w http.ResponseWriter, r *http.Request) {
testHandler.ListCloudBillingBatches(w, r)
},
},
{
name: "topups list",
method: http.MethodGet,
path: "/api/cloud-billing/topups",
wantPx: "/api/v1/billing/topups",
invoke: func(t *testing.T, w http.ResponseWriter, r *http.Request) {
testHandler.ListCloudBillingTopups(w, r)
},
},
{
name: "price tiers",
method: http.MethodGet,
path: "/api/cloud-billing/price-tiers",
wantPx: "/api/v1/billing/price-tiers",
invoke: func(t *testing.T, w http.ResponseWriter, r *http.Request) {
testHandler.ListCloudBillingPriceTiers(w, r)
},
},
{
name: "create checkout session",
method: http.MethodPost,
path: "/api/cloud-billing/checkout-sessions",
body: map[string]any{"tier_id": "starter"},
wantPx: "/api/v1/billing/checkout-sessions",
invoke: func(t *testing.T, w http.ResponseWriter, r *http.Request) {
testHandler.CreateCloudBillingCheckoutSession(w, r)
},
},
{
name: "create portal session",
method: http.MethodPost,
path: "/api/cloud-billing/portal-sessions",
wantPx: "/api/v1/billing/portal-sessions",
invoke: func(t *testing.T, w http.ResponseWriter, r *http.Request) {
testHandler.CreateCloudBillingPortalSession(w, r)
},
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
proxy := &fakeCloudRuntimeProxy{
enabled: true,
resp: &cloudruntime.Response{
StatusCode: http.StatusOK,
Body: []byte(`{"ok":true}`),
},
}
useCloudRuntimeProxy(t, proxy)
req := newRequest(tc.method, tc.path, tc.body)
w := httptest.NewRecorder()
tc.invoke(t, w, req)
if w.Code != http.StatusOK {
t.Fatalf("status = %d, body = %s", w.Code, w.Body.String())
}
if !proxy.called {
t.Fatal("expected cloud proxy to be called")
}
if proxy.req.Method != tc.method {
t.Errorf("upstream method = %s, want %s", proxy.req.Method, tc.method)
}
if proxy.req.Path != tc.wantPx {
t.Errorf("upstream path = %s, want %s", proxy.req.Path, tc.wantPx)
}
if proxy.req.UserID != testUserID {
t.Errorf("upstream user_id = %q, want %q", proxy.req.UserID, testUserID)
}
if got := proxy.req.Query.Encode(); got != tc.wantQ {
t.Errorf("upstream query = %q, want %q", got, tc.wantQ)
}
// Body should be present on POST cases and absent on GET.
if tc.method == http.MethodPost && tc.body != nil && len(proxy.req.Body) == 0 {
t.Error("expected upstream body on POST, got empty")
}
if tc.method == http.MethodGet && len(proxy.req.Body) > 0 {
t.Errorf("upstream body should be empty on GET, got %s", proxy.req.Body)
}
})
}
}
// TestGetCloudBillingCheckoutSession_AppendsSessionIDToPath pins the
// dynamic-path handler. The session id flows from chi URL param into
// the upstream URL, and the upstream therefore sees a different path
// than every other billing endpoint — easy to break by accident.
func TestGetCloudBillingCheckoutSession_AppendsSessionIDToPath(t *testing.T) {
proxy := &fakeCloudRuntimeProxy{
enabled: true,
resp: &cloudruntime.Response{
StatusCode: http.StatusOK,
Body: []byte(`{"order_id":"o","status":"credited"}`),
},
}
useCloudRuntimeProxy(t, proxy)
req := newRequest(http.MethodGet, "/api/cloud-billing/checkout-sessions/cs_test_abc", nil)
req = withURLParam(req, "sessionId", "cs_test_abc")
w := httptest.NewRecorder()
testHandler.GetCloudBillingCheckoutSession(w, req)
if w.Code != http.StatusOK {
t.Fatalf("status = %d, body = %s", w.Code, w.Body.String())
}
if proxy.req.Path != "/api/v1/billing/checkout-sessions/cs_test_abc" {
t.Errorf("upstream path = %s, want /api/v1/billing/checkout-sessions/cs_test_abc", proxy.req.Path)
}
if proxy.req.UserID != testUserID {
t.Errorf("upstream user_id = %q", proxy.req.UserID)
}
}
// TestGetCloudBillingCheckoutSession_RejectsPathTraversal pins the
// defensive bail when the session_id contains characters that would
// alter URL semantics. The cloud-runtime client rejects paths missing
// the leading slash but does not otherwise sanitize, so a stray `/`
// here would re-target the upstream request.
func TestGetCloudBillingCheckoutSession_RejectsPathTraversal(t *testing.T) {
proxy := &fakeCloudRuntimeProxy{enabled: true}
useCloudRuntimeProxy(t, proxy)
for _, sessionID := range []string{
"cs_test/../admin",
"cs?inject=1",
"cs#frag",
} {
t.Run(sessionID, func(t *testing.T) {
req := newRequest(http.MethodGet, "/api/cloud-billing/checkout-sessions/x", nil)
req = withURLParam(req, "sessionId", sessionID)
w := httptest.NewRecorder()
testHandler.GetCloudBillingCheckoutSession(w, req)
if w.Code != http.StatusBadRequest {
t.Fatalf("status = %d, body = %s", w.Code, w.Body.String())
}
if proxy.called {
t.Fatal("upstream must not be called for invalid session_id")
}
})
}
}
// TestGetCloudBillingCheckoutSession_MissingPathParamReturns400 pins
// the no-id branch (defensive — chi shouldn't route to us without a
// param, but we guard anyway).
func TestGetCloudBillingCheckoutSession_MissingPathParamReturns400(t *testing.T) {
proxy := &fakeCloudRuntimeProxy{enabled: true}
useCloudRuntimeProxy(t, proxy)
req := newRequest(http.MethodGet, "/api/cloud-billing/checkout-sessions/", nil)
// No URL param injected.
w := httptest.NewRecorder()
testHandler.GetCloudBillingCheckoutSession(w, req)
if w.Code != http.StatusBadRequest {
t.Fatalf("status = %d, body = %s", w.Code, w.Body.String())
}
if proxy.called {
t.Fatal("upstream must not be called when session_id is missing")
}
}
// TestCloudBillingDisabledReturnsUnavailable confirms self-hosted
// deployments (no cloud URL configured) get a clean 503 rather than
// a cryptic upstream error.
func TestCloudBillingDisabledReturnsUnavailable(t *testing.T) {
useCloudRuntimeProxy(t, &fakeCloudRuntimeProxy{enabled: false})
req := newRequest(http.MethodGet, "/api/cloud-billing/balance", nil)
w := httptest.NewRecorder()
testHandler.GetCloudBillingBalance(w, req)
if w.Code != http.StatusServiceUnavailable {
t.Fatalf("status = %d, body = %s", w.Code, w.Body.String())
}
}
// --- Stripe webhook ---
// TestStripeWebhookForwardsRawBodyAndSignature is the critical
// invariant for the webhook proxy: bytes go upstream byte-for-byte,
// and the Stripe-Signature header rides along. Even one stray
// transformation breaks Stripe's HMAC verification on the cloud side
// and leaves topups stuck in `pending` forever.
//
// We deliberately use a body that includes leading whitespace, a
// trailing newline, and unusual key ordering to catch any
// json.Unmarshal/Marshal round-trip the JSON proxy might
// inadvertently apply.
func TestStripeWebhookForwardsRawBodyAndSignature(t *testing.T) {
rawBody := " \n{\"id\":\"evt_test\",\"type\":\"checkout.session.completed\"}\n"
const sig = "t=1700000000,v1=deadbeef0000aaaa"
const ct = "application/json; charset=utf-8"
proxy := &fakeCloudRuntimeProxy{
enabled: true,
resp: &cloudruntime.Response{
StatusCode: http.StatusOK,
Body: []byte(`{"received":true}`),
},
}
useCloudRuntimeProxy(t, proxy)
req := httptest.NewRequest(http.MethodPost, "/api/webhooks/stripe", strings.NewReader(rawBody))
req.Header.Set("Stripe-Signature", sig)
req.Header.Set("Content-Type", ct)
// Deliberately NO X-User-ID — the webhook must work without auth.
w := httptest.NewRecorder()
testHandler.HandleCloudBillingStripeWebhook(w, req)
if w.Code != http.StatusOK {
t.Fatalf("status = %d, body = %s", w.Code, w.Body.String())
}
if !proxy.called {
t.Fatal("upstream proxy must be called")
}
if proxy.req.Method != http.MethodPost || proxy.req.Path != "/api/v1/webhooks/stripe" {
t.Fatalf("upstream %s %s", proxy.req.Method, proxy.req.Path)
}
if string(proxy.req.Body) != rawBody {
t.Fatalf("upstream body = %q, want %q (byte-perfect)", proxy.req.Body, rawBody)
}
if got := proxy.req.Headers.Get("Stripe-Signature"); got != sig {
t.Fatalf("upstream Stripe-Signature = %q, want %q", got, sig)
}
// Content-Type must arrive verbatim — preserving the
// `; charset=utf-8` suffix Stripe always sends. cloudruntime's
// default would otherwise strip it to plain `application/json`.
if got := proxy.req.Headers.Get("Content-Type"); got != ct {
t.Fatalf("upstream Content-Type = %q, want %q", got, ct)
}
if proxy.req.UserID != "" {
t.Errorf("upstream user_id should be empty for webhook, got %q", proxy.req.UserID)
}
}
// TestStripeWebhookMissingSignatureRejectedLocally pins the early
// 401 we now return when Stripe-Signature is absent. Real Stripe
// deliveries ALWAYS include the header; absence ≡ not from Stripe.
// Rejecting locally saves the upstream RTT (and prevents using us
// as a DoS amplifier against the cloud Billing service).
//
// The contract is documented in HandleCloudBillingStripeWebhook:
// our local 401 is a strict superset of what the upstream would do
// in this case, so Stripe's delivery dashboard sees the same
// outcome it would have if the request had reached cloud.
func TestStripeWebhookMissingSignatureRejectedLocally(t *testing.T) {
proxy := &fakeCloudRuntimeProxy{enabled: true}
useCloudRuntimeProxy(t, proxy)
req := httptest.NewRequest(http.MethodPost, "/api/webhooks/stripe",
strings.NewReader(`{"id":"evt"}`))
w := httptest.NewRecorder()
testHandler.HandleCloudBillingStripeWebhook(w, req)
if w.Code != http.StatusUnauthorized {
t.Fatalf("status = %d, body = %s", w.Code, w.Body.String())
}
if proxy.called {
t.Fatal("upstream must NOT be called when Stripe-Signature is missing")
}
}
// TestStripeWebhookForwardsEmptyBody confirms we don't pre-reject an
// empty body — Stripe's webhook tester sometimes sends pings, and the
// upstream is the source of truth for what's an acceptable payload.
// (We do still cap large bodies; that's a separate test.) The
// signature header is set because, post-fix, the absence of it is
// itself a 401 — see TestStripeWebhookMissingSignatureRejectedLocally.
func TestStripeWebhookForwardsEmptyBody(t *testing.T) {
proxy := &fakeCloudRuntimeProxy{
enabled: true,
resp: &cloudruntime.Response{
StatusCode: http.StatusBadRequest,
Body: []byte(`{"error":"empty body"}`),
},
}
useCloudRuntimeProxy(t, proxy)
req := httptest.NewRequest(http.MethodPost, "/api/webhooks/stripe", http.NoBody)
req.Header.Set("Stripe-Signature", "t=1,v1=deadbeef")
w := httptest.NewRecorder()
testHandler.HandleCloudBillingStripeWebhook(w, req)
if !proxy.called {
t.Fatal("upstream must be called even on empty body")
}
if len(proxy.req.Body) != 0 {
t.Errorf("upstream body = %q, want empty", proxy.req.Body)
}
}
// TestStripeWebhookRejectsLargeBody pins the body cap. Stripe's
// real payloads are well under 1 MiB; an attacker (or a misconfigured
// sender) flooding us with multi-MB bodies must be cut off before we
// buffer the whole thing in memory, and before we spend a Cloud
// upstream round-trip on a doomed verification.
func TestStripeWebhookRejectsLargeBody(t *testing.T) {
proxy := &fakeCloudRuntimeProxy{enabled: true}
useCloudRuntimeProxy(t, proxy)
body := bytes.NewReader(bytes.Repeat([]byte("a"), maxStripeWebhookBodySize+1))
req := httptest.NewRequest(http.MethodPost, "/api/webhooks/stripe", body)
req.Header.Set("Stripe-Signature", "t=1,v1=deadbeef")
w := httptest.NewRecorder()
testHandler.HandleCloudBillingStripeWebhook(w, req)
if w.Code != http.StatusRequestEntityTooLarge {
t.Fatalf("status = %d, body = %s", w.Code, w.Body.String())
}
if proxy.called {
t.Fatal("upstream must not be called for oversized webhook body")
}
}
// TestStripeWebhookDisabledReturnsUnavailable mirrors the
// cloud-runtime disabled test but for the webhook path. Self-hosted
// deployments without a cloud URL must return 503, not crash.
func TestStripeWebhookDisabledReturnsUnavailable(t *testing.T) {
useCloudRuntimeProxy(t, &fakeCloudRuntimeProxy{enabled: false})
req := httptest.NewRequest(http.MethodPost, "/api/webhooks/stripe",
strings.NewReader(`{"id":"evt"}`))
req.Header.Set("Stripe-Signature", "t=1,v1=deadbeef")
w := httptest.NewRecorder()
testHandler.HandleCloudBillingStripeWebhook(w, req)
if w.Code != http.StatusServiceUnavailable {
t.Fatalf("status = %d, body = %s", w.Code, w.Body.String())
}
}
// TestStripeWebhookRateLimited pins the per-IP rate-limit fast
// path. With a denying limiter installed the handler must 429
// BEFORE consulting upstream. Mirrors HandleAutopilotWebhook's
// behaviour: the public webhook ingress sits behind the same
// WebhookIPRateLimiter so a flood of bogus requests doesn't burn
// cloud-side budget.
func TestStripeWebhookRateLimited(t *testing.T) {
proxy := &fakeCloudRuntimeProxy{enabled: true}
useCloudRuntimeProxy(t, proxy)
prevLimiter := testHandler.WebhookIPRateLimiter
testHandler.WebhookIPRateLimiter = denyingWebhookIPRateLimiter{}
t.Cleanup(func() { testHandler.WebhookIPRateLimiter = prevLimiter })
req := httptest.NewRequest(http.MethodPost, "/api/webhooks/stripe",
strings.NewReader(`{"id":"evt"}`))
req.Header.Set("Stripe-Signature", "t=1,v1=deadbeef")
req.RemoteAddr = "203.0.113.7:1234"
w := httptest.NewRecorder()
testHandler.HandleCloudBillingStripeWebhook(w, req)
if w.Code != http.StatusTooManyRequests {
t.Fatalf("status = %d, body = %s", w.Code, w.Body.String())
}
if proxy.called {
t.Fatal("upstream must not be called when rate limited")
}
}
// denyingWebhookIPRateLimiter is the smallest possible limiter that
// always says "no". It exists to drive the 429 branch without
// requiring a Redis test instance — the limiter interface is the
// same one HandleAutopilotWebhook uses.
type denyingWebhookIPRateLimiter struct{}
func (denyingWebhookIPRateLimiter) Allow(_ context.Context, _ string) bool { return false }

View File

@@ -137,6 +137,18 @@ func Auth(queries *db.Queries, patCache *auth.PATCache, cloudPAT *auth.CloudPATV
return
}
r.Header.Set("X-User-ID", identity.OwnerID)
// Tag the auth path so account-level guards (e.g.
// handler.RequireHumanActor on /api/cloud-billing/*)
// can distinguish a cloud-node machine credential
// from a human PAT/JWT. Mirrors the mat_ branch's
// stamp of "task_token" — both are server-set,
// authoritative, and stripped from any client-
// supplied value at the top of this middleware. Same
// rationale as MUL-2600: a machine credential
// (running agent or running cloud node) must not be
// treated as the owner having approved an account-
// level action.
r.Header.Set("X-Actor-Source", "cloud_pat")
next.ServeHTTP(w, r)
return
}

View File

@@ -334,10 +334,11 @@ func TestAuth_MCN_ValidTokenSetsUserID(t *testing.T) {
verifier := auth.NewCloudPATVerifier(auth.CloudPATVerifierConfig{FleetBaseURL: srv.URL})
var gotUser string
var gotUser, gotActorSource string
mw := Auth(nil, nil, verifier)
handler := mw(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotUser = r.Header.Get("X-User-ID")
gotActorSource = r.Header.Get("X-Actor-Source")
w.WriteHeader(http.StatusOK)
}))
@@ -352,6 +353,14 @@ func TestAuth_MCN_ValidTokenSetsUserID(t *testing.T) {
if gotUser != "01972f7e-7e8d-77ef-a13d-1b0ce3e9c001" {
t.Errorf("expected owner_id propagated as X-User-ID, got %q", gotUser)
}
// Pinned per the cloud-billing review: a successful mcn_ verify
// MUST stamp X-Actor-Source so account-level guards (e.g.
// handler.RequireHumanActor on /api/cloud-billing/*) can tell a
// machine credential apart from a human PAT/JWT. Dropping this
// stamp would silently let an mcn_ holder reach billing.
if gotActorSource != "cloud_pat" {
t.Errorf("expected X-Actor-Source=cloud_pat, got %q", gotActorSource)
}
}
// TestAuth_MCN_InvalidReturns401 confirms that a Fleet valid:false maps

View File

@@ -79,6 +79,16 @@ func WithDaemonContext(ctx context.Context, workspaceID, daemonID string) contex
func DaemonAuth(queries *db.Queries, patCache *auth.PATCache, daemonCache *auth.DaemonTokenCache, cloudPAT *auth.CloudPATVerifier) func(http.Handler) http.Handler {
return func(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// X-Actor-Source is server-set only — strip any
// client-supplied value before any branch can re-stamp
// it. This mirrors what Auth middleware does (see auth.go
// "X-Actor-Source is server-set only..." comment) and
// keeps the contract uniform across both middlewares: a
// downstream guard like handler.RequireHumanActor can
// trust this header regardless of which auth path the
// request arrived on.
r.Header.Del("X-Actor-Source")
authHeader := r.Header.Get("Authorization")
if authHeader == "" {
slog.Debug("daemon_auth: missing authorization header", "path", r.URL.Path)
@@ -161,6 +171,16 @@ func DaemonAuth(queries *db.Queries, patCache *auth.PATCache, daemonCache *auth.
return
}
r.Header.Set("X-User-ID", identity.OwnerID)
// Mirror the regular Auth middleware: tag the auth
// path so any downstream guard (handler.
// RequireHumanActor and friends) can recognize this
// request as a machine credential rather than a
// human PAT. Daemon routes don't currently use these
// guards, but keeping the stamp uniform with Auth
// avoids a future surprise where an endpoint moved
// or shared between the two middlewares would behave
// differently depending on which one routed it.
r.Header.Set("X-Actor-Source", "cloud_pat")
ctx := context.WithValue(r.Context(), ctxKeyDaemonAuthPath, DaemonAuthPathCloudPAT)
next.ServeHTTP(w, r.WithContext(ctx))
return

View File

@@ -103,6 +103,53 @@ func TestDaemonAuth_MissingAuth(t *testing.T) {
}
}
// TestDaemonAuth_StripsClientSuppliedActorSource mirrors the
// TestAuth_StripsClientSuppliedActorSource invariant for the daemon
// auth path: a client supplying X-Actor-Source must NOT leak that
// header through to the handler. Required for parity between the
// two middlewares — the regular Auth path strips at the top, and we
// added the same strip in DaemonAuth so account-level guards (e.g.
// handler.RequireHumanActor) can trust the header regardless of
// which auth chain a request arrived on.
//
// We exercise an mdt_ token with an attempted forged X-Actor-Source.
// On the mdt_ path no actor-source stamp is added (daemon tokens
// aren't a "machine credential" in the billing sense — they're a
// runtime-bound proof for the daemon API itself), so a clean strip
// leaves the header empty downstream.
func TestDaemonAuth_StripsClientSuppliedActorSource(t *testing.T) {
rdb := newRedisTestClient(t)
cache := auth.NewDaemonTokenCache(rdb)
const rawToken = "mdt_strip_test"
hash := auth.HashToken(rawToken)
cache.Set(context.Background(), hash, auth.DaemonTokenIdentity{
WorkspaceID: "ws-1",
DaemonID: "daemon-1",
}, auth.AuthCacheTTL)
var gotActorSource string
mw := DaemonAuth(nil, nil, cache, nil)
handler := mw(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotActorSource = r.Header.Get("X-Actor-Source")
w.WriteHeader(http.StatusOK)
}))
req := httptest.NewRequest("POST", "/api/daemon/heartbeat", nil)
req.Header.Set("Authorization", "Bearer "+rawToken)
// Forged value the client tries to smuggle in.
req.Header.Set("X-Actor-Source", "cloud_pat")
w := httptest.NewRecorder()
handler.ServeHTTP(w, req)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
if gotActorSource != "" {
t.Fatalf("X-Actor-Source must be cleared on the mdt_ path, got %q", gotActorSource)
}
}
func TestDaemonAuth_InvalidMDT_NilQueries(t *testing.T) {
mw := DaemonAuth(nil, nil, nil, nil) // no caches, no DB
handler := mw(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
@@ -153,11 +200,12 @@ func TestDaemonAuth_MCN_ValidTokenSetsUserID(t *testing.T) {
verifier := auth.NewCloudPATVerifier(auth.CloudPATVerifierConfig{FleetBaseURL: srv.URL})
var gotUser, gotPath string
var gotUser, gotPath, gotActorSource string
mw := DaemonAuth(nil, nil, nil, verifier)
handler := mw(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotUser = r.Header.Get("X-User-ID")
gotPath = DaemonAuthPathFromContext(r.Context())
gotActorSource = r.Header.Get("X-Actor-Source")
w.WriteHeader(http.StatusOK)
}))
@@ -175,6 +223,14 @@ func TestDaemonAuth_MCN_ValidTokenSetsUserID(t *testing.T) {
if gotPath != DaemonAuthPathCloudPAT {
t.Errorf("expected auth path %q, got %q", DaemonAuthPathCloudPAT, gotPath)
}
// Mirror the regular Auth middleware's stamp. Daemon routes don't
// currently sit behind RequireHumanActor, but we want the two
// auth paths to behave identically on this header so an endpoint
// that ever moves between them, or shares both, can't be tricked
// into thinking an mcn_ caller is human.
if gotActorSource != "cloud_pat" {
t.Errorf("expected X-Actor-Source=cloud_pat, got %q", gotActorSource)
}
}
// TestDaemonAuth_MCN_FleetSaysInvalid confirms that a valid:false