fix(billing): block mat_ task-token actors from /api/cloud-billing/*

Adds handler.RequireHumanActor — a chi middleware that 403s any
request carrying X-Actor-Source=task_token — and applies it to the
cloud-billing route group.

Why this matters (the bug it fixes):

The general Auth middleware (server/internal/middleware/auth.go)
turns every recognized bearer format into the same shape: a stamped
X-User-ID header. JWT cookie, mul_ PAT, AND mat_ task token all
produce a non-empty X-User-ID. That uniformity is correct for issue
/ comment / chat scopes — agents acting as their owner inside a
bounded (agent, task) job is exactly what mat_ is designed for.

It is NOT correct for account-level scopes:

  * Reading balance / transactions / batches / topups list lets an
    agent see its owner's wallet state without the owner approving
    the query.

  * Creating checkout / portal sessions can move money or leak
    payment-method state. A compromised agent (prompt-injected, bad
    MCP tool, escaped quote in scratch data) could spin up a checkout
    bound to an attacker-controlled email or open a Billing Portal
    session.

Before this commit, /api/cloud-billing/* sat outside the workspace
group with only the generic Auth chain, so an mat_ task token's
stamped X-User-ID was indistinguishable from an mul_ PAT's. The
proxy happily forwarded everything to multica-cloud, where the
upstream's owner-only contract was bypassed entirely.

The fix:

  * RequireHumanActor middleware checks the SERVER-SET (not client-
    settable) X-Actor-Source header. The Auth middleware deletes any
    client-supplied value before stamping its own, so a non-empty
    'task_token' value here is authoritative — a member-token caller
    cannot forge it, and an mat_-token caller cannot strip it.

  * Wired in via r.Use(handler.RequireHumanActor) on the
    /api/cloud-billing route group. Applying it as middleware (not
    inline in each handler) means a developer adding a new billing
    endpoint cannot accidentally skip the gate.

  * Deliberately denylist-shaped, not allowlist. Future actor kinds
    added to auth.go (e.g. service-account tokens) get a conscious
    decision-point at the gate rather than being pre-emptively shut
    out by an over-broad rule today.

The Stripe webhook is untouched — it sits outside the entire Auth
group, so X-Actor-Source is never stamped on it (the auth middleware
doesn't run). The guard is irrelevant on that path.

Tests:

  * TestRequireHumanActor_AllowsHumanRequest — empty actor source
    passes through (JWT / mul_ PAT path).
  * TestRequireHumanActor_BlocksTaskToken — 'task_token' value
    triggers 403 and inner handler is not called.
  * TestRequireHumanActor_IgnoresUnknownActorSource — pins the
    denylist shape with explicit comment so the next person adding
    an actor kind knows where to revisit.
  * TestRequireHumanActor_AppliedViaChiRouterUse — small but
    important: builds a real chi router with r.Use(...) and proves
    the guard fires on every endpoint in the group, which is the
    contract router.go relies on.

go build / go vet / go test ./... clean.
This commit is contained in:
yushen
2026-05-28 15:40:28 +08:00
parent b604c6d9b0
commit 86d309dbb8
3 changed files with 233 additions and 0 deletions

View File

@@ -409,7 +409,20 @@ func NewRouterWithOptions(pool *pgxpool.Pool, hub *realtime.Hub, bus *events.Bus
// model is single-user; X-Workspace-ID would be ignored even
// if we sent it. The Stripe webhook is the public outlier
// and lives outside the entire Auth group (see above).
//
// IMPORTANT — task-token actors are blocked here. The Auth
// middleware happily turns an mat_ task token into a normal
// X-User-ID stamp (so agents can comment, claim issues, etc.
// as their owner), but billing is account-level and a running
// agent reading its owner's balance / opening a checkout
// session is the kind of lateral-movement we're explicitly
// trying to prevent. handler.RequireHumanActor checks the
// authoritative server-set X-Actor-Source header and 403s
// any task-token request. See actor_guards.go for the full
// rationale.
r.Route("/api/cloud-billing", func(r chi.Router) {
r.Use(handler.RequireHumanActor)
r.Get("/balance", h.GetCloudBillingBalance)
r.Get("/transactions", h.ListCloudBillingTransactions)
r.Get("/batches", h.ListCloudBillingBatches)

View File

@@ -0,0 +1,80 @@
package handler
import (
"net/http"
)
// RequireHumanActor is a chi-style middleware that rejects requests
// authenticated via an mat_ task token. It exists for endpoints whose
// authorization model is "the human owner authorized this", not
// "anyone holding the owner's credentials authorized this".
//
// Why this guard is needed (read carefully — auth here is subtle):
//
// The general Auth middleware (server/internal/middleware/auth.go)
// turns three different bearer formats into the same shape — a stamped
// `X-User-ID` header — so downstream handlers don't have to care which
// token kind the caller used:
//
// - JWT cookie / mul_ PAT → X-User-ID = the human's user id
// - mat_ task token → X-User-ID = the OWNING human's user id,
// plus X-Agent-ID, X-Task-ID, and the
// authoritative server-set header
// `X-Actor-Source: task_token`
//
// The mat_ design (MUL-2600) was deliberately built this way: every
// request the agent makes is treated as the owner's, so the agent can
// post comments, claim issues, etc., as if the owner had done it. That
// is correct for issue / comment / chat scopes — those are bounded by
// workspace membership, and the task token is bound to a specific
// (agent, task) pair so an agent can only act inside its job.
//
// It is NOT correct for account-level scopes:
//
// - Billing balance / transactions / batches / topups list
// are user-scoped. A running agent could read its owner's wallet
// state without the owner ever having approved a billing query.
//
// - Checkout / portal session creation can move money. An agent that
// gets compromised — by a prompt injection, a bad MCP tool, an
// escaped quote in scratch data — could spin up a checkout for an
// attacker-controlled email or open a Billing Portal session that
// leaks subscription / payment-method state.
//
// `X-Actor-Source` is server-set only. The Auth middleware deletes any
// client-supplied value first (see auth.go: `r.Header.Del("X-Actor-Source")`),
// then re-sets it ONLY on the mat_ branch. So checking this header is
// the safe, fast, single-source-of-truth way to know "is the request
// from an agent process?" — without re-querying the token table.
//
// We deliberately do NOT use h.resolveActor() here:
//
// - resolveActor's primary job is "agent vs member" classification
// for ownership / authorship attribution (issue creator, comment
// author, etc.). It also has a fallback path that trusts
// X-Agent-ID + X-Task-ID for legacy CLI flows; that fallback is
// valid for resolving authorship but is irrelevant here. Billing
// authorization needs the strict "task_token actor → forbidden"
// gate, nothing else.
// - resolveActor takes a workspaceID parameter; billing routes have
// no workspace context, so threading one through just to call it
// would be misleading.
//
// Apply via `r.Use(handler.RequireHumanActor)` on a chi route group.
// The middleware is intentionally NOT wired in via the router's main
// Auth chain — the default contract elsewhere (issues, chat, etc.) is
// "agent and human are interchangeable", and adding a global gate
// would break legitimate agent traffic. Only attach it where the
// scope is truly human-only.
func RequireHumanActor(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// X-Actor-Source is server-set only. The auth middleware
// strips any client-supplied value before stamping its own,
// so a non-empty "task_token" value here is authoritative.
if r.Header.Get("X-Actor-Source") == "task_token" {
writeError(w, http.StatusForbidden, "this endpoint is only available to human actors")
return
}
next.ServeHTTP(w, r)
})
}

View File

@@ -0,0 +1,140 @@
package handler
import (
"net/http"
"net/http/httptest"
"testing"
"github.com/go-chi/chi/v5"
)
// TestRequireHumanActor_AllowsHumanRequest pins the happy path: a
// request that passed Auth as a JWT or mul_ PAT does NOT carry
// X-Actor-Source, so the guard lets it through and the inner handler
// runs.
//
// We construct a bare http.Handler chain (no full router) so the test
// exercises only the middleware logic and is independent of any
// downstream wiring.
func TestRequireHumanActor_AllowsHumanRequest(t *testing.T) {
called := false
next := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
called = true
w.WriteHeader(http.StatusOK)
})
mw := RequireHumanActor(next)
req := httptest.NewRequest(http.MethodGet, "/api/cloud-billing/balance", nil)
// No X-Actor-Source — this is the JWT / mul_ PAT shape.
w := httptest.NewRecorder()
mw.ServeHTTP(w, req)
if w.Code != http.StatusOK {
t.Fatalf("status = %d, want 200", w.Code)
}
if !called {
t.Fatal("inner handler must run for non-task-token requests")
}
}
// TestRequireHumanActor_BlocksTaskToken pins the security-critical
// branch. An mat_ task token would have caused the Auth middleware to
// stamp X-Actor-Source=task_token; the guard MUST 403 such requests
// before they can reach a billing handler.
func TestRequireHumanActor_BlocksTaskToken(t *testing.T) {
next := http.HandlerFunc(func(_ http.ResponseWriter, _ *http.Request) {
t.Fatal("inner handler must NOT run for task-token requests")
})
mw := RequireHumanActor(next)
req := httptest.NewRequest(http.MethodGet, "/api/cloud-billing/balance", nil)
// This is what the Auth middleware sets for an mat_ token caller.
// Setting it directly here proves the gate triggers on the header
// regardless of upstream context — which is also what the auth
// middleware guarantees, since it strips client-supplied values
// before stamping its own.
req.Header.Set("X-Actor-Source", "task_token")
w := httptest.NewRecorder()
mw.ServeHTTP(w, req)
if w.Code != http.StatusForbidden {
t.Fatalf("status = %d, want 403", w.Code)
}
}
// TestRequireHumanActor_IgnoresUnknownActorSource pins the gate's
// scope: it is an explicit denylist against the known-bad
// "task_token" value, NOT an allowlist against "human only / empty".
//
// Why the denylist shape:
//
// - The Auth middleware today sets X-Actor-Source for exactly one
// case: mat_ task tokens. Every other authenticated path (JWT,
// mul_ PAT) leaves the header empty. So "non-empty AND not
// task_token" is unreachable in current production.
//
// - If a future actor kind is added (say a hypothetical
// `service_account` token), this gate's silence on the new value
// is a CONSCIOUS DECISION POINT, not an accident. The added auth
// branch is the right place to decide whether the new kind should
// be allowed at billing endpoints — and that decision belongs in
// a security review at the time, not in a default-deny rule here
// that pre-emptively shuts out hypothetical use cases we cannot
// reason about today.
//
// If you are reading this comment because a new actor kind needs to
// reach billing or needs to be blocked from it, update
// RequireHumanActor to handle the new kind explicitly (and update
// this test's expectation accordingly).
func TestRequireHumanActor_IgnoresUnknownActorSource(t *testing.T) {
called := false
next := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
called = true
w.WriteHeader(http.StatusOK)
})
mw := RequireHumanActor(next)
// A hypothetical future value the gate doesn't know about.
req := httptest.NewRequest(http.MethodGet, "/api/cloud-billing/balance", nil)
req.Header.Set("X-Actor-Source", "future_kind")
w := httptest.NewRecorder()
mw.ServeHTTP(w, req)
if w.Code != http.StatusOK {
t.Fatalf("status = %d, want 200 — gate should only block exact 'task_token'", w.Code)
}
if !called {
t.Fatal("inner handler must run for unknown actor sources")
}
}
// TestRequireHumanActor_AppliedViaChiRouterUse pins the wiring side of
// the contract: when the guard is attached to a chi route group via
// r.Use, every endpoint in that group is protected, and a task-token
// request never reaches the handler — even one we add later. This is
// what router.go's r.Route("/api/cloud-billing", ...) + r.Use(...)
// guarantees in production; the test is small but a developer adding
// a new billing endpoint and forgetting to re-attach the middleware
// would not be caught by the per-handler tests above.
func TestRequireHumanActor_AppliedViaChiRouterUse(t *testing.T) {
// Use a real chi router so we exercise r.Use(), not just the
// middleware function in isolation.
r := chi.NewRouter()
r.Use(RequireHumanActor)
r.Get("/billing/probe", func(_ http.ResponseWriter, _ *http.Request) {
t.Fatal("inner handler must NOT run when guard rejects")
})
req := httptest.NewRequest(http.MethodGet, "/billing/probe", nil)
req.Header.Set("X-Actor-Source", "task_token")
w := httptest.NewRecorder()
r.ServeHTTP(w, req)
if w.Code != http.StatusForbidden {
t.Fatalf("status = %d, want 403", w.Code)
}
}