mirror of
https://github.com/multica-ai/multica.git
synced 2026-06-17 11:48:42 +02:00
* feat(server/auth): cache PAT lookups in Redis with 60s TTL

Personal access tokens used to hit Postgres on every request: a SELECT
to resolve token_hash → user_id, plus a fire-and-forget UPDATE of
last_used_at. For a CLI / daemon making many requests per second this
is wasted DB load: the token is the same and the answer hasn't changed.

Add a Redis-backed cache (auth.PATCache) keyed by token hash, TTL 60s:

- On cache hit, the auth middleware skips both the SELECT and the
  last_used_at UPDATE. last_used_at is now refreshed at most once per
  TTL window per token, not per request.
- On cache miss the middleware falls back to today's behavior: query
  Postgres, populate the cache, async-update last_used_at.
- On revoke, the handler invalidates the cache entry so revocation
  takes effect immediately rather than waiting for the TTL to expire.
  This required changing RevokePersonalAccessToken from :exec to :one
  RETURNING token_hash.

The cache is nil-safe: when REDIS_URL isn't configured, NewPATCache
returns nil and the middleware degrades to today's always-hit-DB
behavior. JWT validation is untouched (already DB-free).

Tested with REDIS_TEST_URL, the same gating pattern the rest of the
suite uses for Redis-backed tests. New tests cover nil-safety, set/
get/invalidate, TTL, and the middleware short-circuit on cache hit.
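
The hit/miss/revoke flow and the nil-safe degradation described above can be sketched roughly as follows. This is not the repository's code: memCache is a hypothetical in-memory stand-in for the Redis-backed auth.PATCache, and resolve, lookup and cacheTTL are illustrative names assumed for this example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cacheTTL mirrors the 60s TTL described in the commit message.
const cacheTTL = 60 * time.Second

type entry struct {
	userID    string
	expiresAt time.Time
}

// memCache is a hypothetical in-memory stand-in for the Redis-backed cache.
type memCache struct {
	mu      sync.Mutex
	entries map[string]entry
}

// newMemCache returns nil when caching is disabled, as NewPATCache is
// described to do when REDIS_URL is not configured.
func newMemCache(enabled bool) *memCache {
	if !enabled {
		return nil
	}
	return &memCache{entries: make(map[string]entry)}
}

// Get reports a miss on a nil receiver, so callers never need a nil check.
func (c *memCache) Get(hash string) (string, bool) {
	if c == nil {
		return "", false
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[hash]
	if !ok || !time.Now().Before(e.expiresAt) {
		delete(c.entries, hash)
		return "", false
	}
	return e.userID, true
}

// Set is a no-op on a nil receiver.
func (c *memCache) Set(hash, userID string) {
	if c == nil {
		return
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[hash] = entry{userID: userID, expiresAt: time.Now().Add(cacheTTL)}
}

// Invalidate is a no-op on a nil receiver.
func (c *memCache) Invalidate(hash string) {
	if c == nil {
		return
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.entries, hash)
}

// resolve tries the cache first and falls back to lookup (the simulated
// Postgres SELECT) on a miss, populating the cache afterwards.
func resolve(c *memCache, hash string, lookup func(string) (string, bool)) (string, bool) {
	if id, ok := c.Get(hash); ok {
		return id, true
	}
	id, ok := lookup(hash)
	if !ok {
		return "", false
	}
	c.Set(hash, id)
	return id, true
}

func main() {
	dbCalls := 0
	lookup := func(hash string) (string, bool) {
		dbCalls++
		return "user-1", hash == "hash-A"
	}

	cached := newMemCache(true)
	resolve(cached, "hash-A", lookup)
	resolve(cached, "hash-A", lookup) // served from the cache
	fmt.Println("lookups with cache:", dbCalls)

	cached.Invalidate("hash-A") // what the revoke handler would do
	resolve(cached, "hash-A", lookup)
	fmt.Println("lookups after invalidate:", dbCalls)

	dbCalls = 0
	disabled := newMemCache(false) // nil cache: every request reaches the lookup
	resolve(disabled, "hash-A", lookup)
	resolve(disabled, "hash-A", lookup)
	fmt.Println("lookups without cache:", dbCalls)
}
```

Putting the nil check inside each method keeps the middleware free of "is caching configured?" branches; a nil cache simply behaves as a cache that always misses.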

* fix(server/auth): clamp PAT cache TTL to token's remaining lifetime

GPT-Boy review caught: a PAT expiring in <60s would still be cached
for the full PATCacheTTL window, so the token could continue passing
auth on cache hit for up to ~60s after its expires_at. The DB query
filters expired tokens (revoked = FALSE AND expires_at > now()), but
that filter never ran on a cache hit.

Make Set take an explicit ttl, and add TTLForExpiry to compute it:

- no expires_at    → full PATCacheTTL
- expires_at far   → full PATCacheTTL
- expires_at <60s  → time until expiry
- already expired  → 0, Set skips caching (TOCTOU defense between
                     the SELECT and the Set, since the SELECT
                     already filters expired rows)

Regression test pins the clamp behavior end-to-end against Redis.

* feat(server/auth): cache daemon-token + PAT lookups in DaemonAuth, bump TTL to 10m

Daemon /api/daemon/* requests (heartbeat, claim task) hit DaemonAuth
which previously did its own GetDaemonTokenByHash on every request and
*also* duplicated the PAT lookup on the mul_ fallback, bypassing the
cache added in 1cdd674c. Today's daemons authenticate via mul_ PATs
(mdt_ minting isn't wired up yet), so the duplicate PAT path is the one
that actually matters for hot-path DB load.

Three changes:

1. New auth.DaemonTokenCache mirrors PATCache for the mdt_ path
   (key = mul:auth:daemon:<sha256>, JSON value = {workspace_id, daemon_id}).
   Forward-looking infrastructure for when daemon tokens get minted; the
   middleware short-circuits the DB SELECT on cache hit. TTL clamped to
   the token's expires_at via the shared TTLForExpiry helper.
2. DaemonAuth now also consults PATCache on its mul_ fallback, sharing
   the same cache as the regular Auth middleware. A daemon making 4
   heartbeats/min collapses from 4 GetPersonalAccessTokenByHash + 4
   last_used_at writes per minute to ~1 of each per AuthCacheTTL window
   (~10 minutes).
3. Rename PATCacheTTL → AuthCacheTTL and bump from 60s to 10 minutes.
   The constant is now shared between PAT and daemon caches; 10m matches
   the user-requested longer TTL for further DB write reduction. Revoke
   latency on the happy path is still instant via active invalidation;
   the worst-case (Redis Del miss / direct-DB revoke) grows from ~60s to
   ~10m.

Tests cover nil-safety, set/get/invalidate, TTL, clamped TTL on near-
expiry tokens, and the middleware short-circuit for both cache paths
(mdt_ via DaemonTokenCache, mul_ fallback via PATCache).

* feat(server/auth): cache PAT lookups on the WebSocket auth path

The third place a PAT is resolved, patResolver.ResolveToken used by
realtime.HandleWebSocket, was still hitting Postgres on every /ws
auth and firing an unconditional last_used_at UPDATE, bypassing the
cache added in 1cdd674c. Wire it through the same shared PATCache so
that revoking a token takes effect on all three lookup paths (Auth
middleware, DaemonAuth PAT fallback, and WS auth) with one Invalidate.

Also leaves a comment on DeleteDaemonTokensByWorkspaceAndDaemon:
the query has no caller today, but a future deregister/rotate flow
must remember to call DaemonTokenCache.Invalidate(hash) for each
deleted row, otherwise deleted daemon tokens stay valid until TTL.
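
A future deregister flow that follows the note above might look roughly like this. No such flow exists yet, so every name here is hypothetical: deregister, the invalidator interface, the recordingCache fake, and the deleteRows callback (which stands in for a DELETE query that returns the token hashes of the rows it removed).

```go
package main

import "fmt"

// invalidator is the single cache method the flow depends on.
type invalidator interface {
	Invalidate(hash string)
}

// recordingCache is a fake cache that records which hashes were invalidated.
type recordingCache struct {
	invalidated []string
}

func (r *recordingCache) Invalidate(hash string) {
	r.invalidated = append(r.invalidated, hash)
}

// deregister deletes the daemon's token rows and then invalidates the cache
// entry for every deleted hash, so no token stays valid until its TTL runs out.
func deregister(deleteRows func() ([]string, error), cache invalidator) (int, error) {
	hashes, err := deleteRows()
	if err != nil {
		return 0, err // nothing was deleted, so nothing to invalidate
	}
	for _, h := range hashes {
		cache.Invalidate(h)
	}
	return len(hashes), nil
}

func main() {
	cache := &recordingCache{}
	n, err := deregister(func() ([]string, error) {
		return []string{"hash-1", "hash-2"}, nil
	}, cache)
	if err != nil {
		panic(err)
	}
	fmt.Println("deleted rows:", n)
	fmt.Println("invalidated: ", cache.invalidated)
}
```

Invalidating only after the delete succeeds avoids dropping cache entries for tokens that are still valid in the database.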
94 lines
2.6 KiB
Go
package auth

import (
	"context"
	"testing"
	"time"
)

// TestDaemonTokenCache_NilSafe checks that every method on a nil
// *DaemonTokenCache is a safe no-op: Get misses, Set and Invalidate do not panic.
func TestDaemonTokenCache_NilSafe(t *testing.T) {
	var c *DaemonTokenCache // nil
	ctx := context.Background()

	if id, ok := c.Get(ctx, "any-hash"); ok || id != (DaemonTokenIdentity{}) {
		t.Fatalf("nil cache must miss; got (%+v, %v)", id, ok)
	}
	c.Set(ctx, "any-hash", DaemonTokenIdentity{WorkspaceID: "w", DaemonID: "d"}, AuthCacheTTL)
	c.Invalidate(ctx, "any-hash")
}

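// TestNewDaemonTokenCache_NilRedisReturnsNil checks that the constructor
// returns a nil cache when no Redis client is supplied.
func TestNewDaemonTokenCache_NilRedisReturnsNil(t *testing.T) {
	if c := NewDaemonTokenCache(nil); c != nil {
		t.Fatalf("NewDaemonTokenCache(nil) must return nil, got %#v", c)
	}
}
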
// TestDaemonTokenCache_SetGetInvalidate exercises the miss, hit, and
// invalidate paths against Redis. newRedisTestClient is a shared helper
// defined elsewhere in this package.
func TestDaemonTokenCache_SetGetInvalidate(t *testing.T) {
	rdb := newRedisTestClient(t)
	c := NewDaemonTokenCache(rdb)
	if c == nil {
		t.Fatal("NewDaemonTokenCache returned nil")
	}
	ctx := context.Background()

	if _, ok := c.Get(ctx, "missing"); ok {
		t.Fatal("expected miss before set")
	}

	want := DaemonTokenIdentity{WorkspaceID: "ws-uuid", DaemonID: "daemon-1"}
	c.Set(ctx, "hash-D", want, AuthCacheTTL)
	if got, ok := c.Get(ctx, "hash-D"); !ok || got != want {
		t.Fatalf("expected hit %+v, got (%+v, %v)", want, got, ok)
	}

	c.Invalidate(ctx, "hash-D")
	if _, ok := c.Get(ctx, "hash-D"); ok {
		t.Fatal("expected miss after invalidate")
	}
}

// TestDaemonTokenCache_TTL checks that Set stores the entry with a positive
// Redis TTL no longer than AuthCacheTTL (plus one second of slack).
func TestDaemonTokenCache_TTL(t *testing.T) {
	rdb := newRedisTestClient(t)
	c := NewDaemonTokenCache(rdb)
	if c == nil {
		t.Fatal("NewDaemonTokenCache returned nil")
	}
	ctx := context.Background()

	c.Set(ctx, "hash-T", DaemonTokenIdentity{WorkspaceID: "w", DaemonID: "d"}, AuthCacheTTL)
	ttl, err := rdb.TTL(ctx, daemonTokenCacheKey("hash-T")).Result()
	if err != nil {
		t.Fatalf("TTL: %v", err)
	}
	if ttl <= 0 || ttl > AuthCacheTTL+time.Second {
		t.Fatalf("unexpected TTL %v (want ~%v)", ttl, AuthCacheTTL)
	}
}

// TestDaemonTokenCache_Set_RespectsClampedTTL checks that Set honors a
// caller-supplied TTL shorter than AuthCacheTTL and skips caching entirely
// when the TTL is zero or negative.
func TestDaemonTokenCache_Set_RespectsClampedTTL(t *testing.T) {
	rdb := newRedisTestClient(t)
	c := NewDaemonTokenCache(rdb)
	if c == nil {
		t.Fatal("NewDaemonTokenCache returned nil")
	}
	ctx := context.Background()

	c.Set(ctx, "hash-short", DaemonTokenIdentity{WorkspaceID: "w", DaemonID: "d"}, 5*time.Second)
	ttl, err := rdb.TTL(ctx, daemonTokenCacheKey("hash-short")).Result()
	if err != nil {
		t.Fatalf("TTL: %v", err)
	}
	if ttl <= 0 || ttl > 5*time.Second+time.Second {
		t.Fatalf("expected clamped TTL ~5s, got %v", ttl)
	}

	c.Set(ctx, "hash-zero", DaemonTokenIdentity{WorkspaceID: "w", DaemonID: "d"}, 0)
	if _, ok := c.Get(ctx, "hash-zero"); ok {
		t.Fatal("zero-TTL Set must not cache")
	}
	c.Set(ctx, "hash-neg", DaemonTokenIdentity{WorkspaceID: "w", DaemonID: "d"}, -time.Second)
	if _, ok := c.Get(ctx, "hash-neg"); ok {
		t.Fatal("negative-TTL Set must not cache")
	}
}