mirror of
https://github.com/multica-ai/multica.git
synced 2026-07-05 21:39:54 +02:00
* feat(server/auth): cache PAT lookups in Redis with 60s TTL
Personal access tokens used to hit Postgres on every request: a SELECT
to resolve token_hash → user_id, plus a fire-and-forget UPDATE of
last_used_at. For a CLI / daemon making many requests per second this
is wasted DB load — the token is the same and the answer hasn't changed.
Add a Redis-backed cache (auth.PATCache) keyed by token hash, TTL 60s:
- On cache hit, the auth middleware skips both the SELECT and the
last_used_at UPDATE. last_used_at is now refreshed at most once per
TTL window per token, not per request.
- On cache miss the middleware falls back to today's behavior: query
Postgres, populate the cache, async-update last_used_at.
- On revoke, the handler invalidates the cache entry so revocation
takes effect immediately rather than waiting for the TTL to expire.
This required changing RevokePersonalAccessToken from :exec to :one
RETURNING token_hash.
The cache is nil-safe: when REDIS_URL isn't configured, NewPATCache
returns nil and the middleware degrades to today's always-hit-DB
behavior. JWT validation is untouched (already DB-free).
Tested with REDIS_TEST_URL — same gating pattern the rest of the
suite uses for Redis-backed tests. New tests cover nil-safety, set/
get/invalidate, TTL, and the middleware short-circuit on cache hit.
* fix(server/auth): clamp PAT cache TTL to token's remaining lifetime
GPT-Boy review caught: a PAT expiring in <60s would still be cached
for the full PATCacheTTL window, so the token could continue passing
auth on cache hit for up to ~60s after its expires_at. The DB query
filters expired tokens (revoked = FALSE AND expires_at > now()), but
that filter never ran on a cache hit.
Make Set take an explicit ttl, and add TTLForExpiry to compute it:
- no expires_at    → full PATCacheTTL
- expires_at far   → full PATCacheTTL
- expires_at <60s  → time until expiry
- already expired  → 0, Set skips caching (TOCTOU defense between
                     the SELECT and the Set, since the SELECT
                     already filters expired rows)
Regression test pins the clamp behavior end-to-end against Redis.
* feat(server/auth): cache daemon-token + PAT lookups in DaemonAuth, bump TTL to 10m
Daemon /api/daemon/* requests (heartbeat, claim task) hit DaemonAuth
which previously did its own GetDaemonTokenByHash on every request and
*also* duplicated the PAT lookup on the mul_ fallback — bypassing the
cache added in 1cdd674c. Today's daemons authenticate via mul_ PATs
(mdt_ minting isn't wired up yet), so the duplicate PAT path is the one
that actually matters for hot-path DB load.
Three changes:
1. New auth.DaemonTokenCache mirrors PATCache for the mdt_ path
(key = mul:auth:daemon:<sha256>, JSON value = {workspace_id, daemon_id}).
Forward-looking infrastructure for when daemon tokens get minted; the
middleware short-circuits the DB SELECT on cache hit. TTL clamped to
the token's expires_at via the shared TTLForExpiry helper.
2. DaemonAuth now also consults PATCache on its mul_ fallback, sharing
the same cache as the regular Auth middleware. A daemon sending 4 heartbeats
per minute collapses from 4 GetPersonalAccessTokenByHash + 4 last_used_at
writes per minute to ~1 of each per AuthCacheTTL window (~10 minutes).
3. Rename PATCacheTTL → AuthCacheTTL and bump from 60s to 10 minutes.
The constant is now shared between PAT and daemon caches; 10m matches
the user-requested longer TTL for further DB write reduction. Revoke
latency on the happy path is still instant via active invalidation;
the worst-case (Redis Del miss / direct-DB revoke) grows from ~60s to
~10m.
Tests cover nil-safety, set/get/invalidate, TTL, clamped TTL on near-
expiry tokens, and the middleware short-circuit for both cache paths
(mdt_ via DaemonTokenCache, mul_ fallback via PATCache).
* feat(server/auth): cache PAT lookups on the WebSocket auth path
The third place a PAT is resolved — patResolver.ResolveToken used by
realtime.HandleWebSocket — was still hitting Postgres on every /ws
auth and firing an unconditional last_used_at UPDATE, bypassing the
cache added in 1cdd674c. Wire it through the same shared PATCache so
that revoking a token takes effect on all three paths (Auth middleware,
DaemonAuth PAT fallback, and WS auth) with a single Invalidate.
Also leaves a comment on DeleteDaemonTokensByWorkspaceAndDaemon —
the query has no caller today, but a future deregister/rotate flow
must remember to call DaemonTokenCache.Invalidate(hash) for each
deleted row, otherwise deleted daemon tokens stay valid until TTL.
100 lines
3.3 KiB
Go
package auth

import (
	"context"
	"encoding/json"
	"errors"
	"log/slog"
	"time"

	"github.com/redis/go-redis/v9"
)

// daemonTokenCachePrefix namespaces daemon-token cache keys separately
// from PAT (mul:auth:pat:*) so the two key spaces can't collide and an
// invalidation on one kind of token doesn't accidentally hit the other.
const daemonTokenCachePrefix = "mul:auth:daemon:"

// DaemonTokenIdentity is what DaemonAuth needs from the cached lookup:
// the workspace_id and daemon_id that the middleware injects into the
// request context. We deliberately omit token_hash, expires_at, and the
// row id; cache entries should leak the minimum.
type DaemonTokenIdentity struct {
	WorkspaceID string `json:"w"`
	DaemonID    string `json:"d"`
}

// DaemonTokenCache caches resolved daemon-token (mdt_) lookups in Redis.
// A nil *DaemonTokenCache is safe to use: every method becomes a no-op
// or reports a cache miss, so single-node dev / tests with no REDIS_URL
// degrade cleanly to direct DB lookups.
type DaemonTokenCache struct {
	rdb *redis.Client
}

// NewDaemonTokenCache returns a cache backed by rdb. Pass nil to disable
// caching; the returned *DaemonTokenCache is safe to call but never hits
// Redis.
func NewDaemonTokenCache(rdb *redis.Client) *DaemonTokenCache {
	if rdb == nil {
		return nil
	}
	return &DaemonTokenCache{rdb: rdb}
}

func daemonTokenCacheKey(hash string) string { return daemonTokenCachePrefix + hash }

// Get returns the cached identity for a token hash. ok=false on cache
// miss or any Redis / decode error: a dead Redis must not take down
// auth.
func (c *DaemonTokenCache) Get(ctx context.Context, hash string) (DaemonTokenIdentity, bool) {
	if c == nil {
		return DaemonTokenIdentity{}, false
	}
	raw, err := c.rdb.Get(ctx, daemonTokenCacheKey(hash)).Bytes()
	if err != nil {
		if !errors.Is(err, redis.Nil) {
			slog.Warn("daemon_token_cache: get failed; falling back to DB", "error", err)
		}
		return DaemonTokenIdentity{}, false
	}
	var id DaemonTokenIdentity
	if err := json.Unmarshal(raw, &id); err != nil {
		slog.Warn("daemon_token_cache: malformed entry; falling back to DB", "error", err)
		return DaemonTokenIdentity{}, false
	}
	return id, true
}

// Set populates the cache with the given TTL. Use TTLForExpiry to clamp
// the TTL to the token's remaining lifetime so a daemon token expiring
// in <AuthCacheTTL can't outlive its expires_at on a cache hit.
//
// Errors are logged and swallowed; a cache write failure is not a
// request failure.
func (c *DaemonTokenCache) Set(ctx context.Context, hash string, id DaemonTokenIdentity, ttl time.Duration) {
	if c == nil || ttl <= 0 {
		return
	}
	raw, err := json.Marshal(id)
	if err != nil {
		slog.Warn("daemon_token_cache: marshal failed", "error", err)
		return
	}
	if err := c.rdb.Set(ctx, daemonTokenCacheKey(hash), raw, ttl).Err(); err != nil {
		slog.Warn("daemon_token_cache: set failed", "error", err)
	}
}

// Invalidate removes the entry for hash. Called when a daemon token is
// deleted so the deletion takes effect immediately rather than waiting
// for the TTL.
func (c *DaemonTokenCache) Invalidate(ctx context.Context, hash string) {
	if c == nil {
		return
	}
	if err := c.rdb.Del(ctx, daemonTokenCacheKey(hash)).Err(); err != nil {
		slog.Warn("daemon_token_cache: invalidate failed; entry will expire on TTL", "error", err)
	}
}