* test(migrate): add concurrent migration race test using real Postgres (MUL-2956)
Follow-up to MUL-2923 / #3658, which added a Postgres advisory lock to
serialize the migration loop across concurrent runners (multi-replica
backend startup, scale-up, manual `migrate up` overlap). That PR shipped
without a test because cmd/migrate/ had no harness; this commit adds it.
Refactor: extract runMigrations(ctx, pool, runOptions) from main(), with
the lock key, the bookkeeping table, and the file list now injectable.
main() behavior is unchanged. Identifier interpolation goes through
pgx.Identifier{}.Sanitize so callers can pass "schema.schema_migrations"
safely.
Tests (cmd/migrate/migrate_concurrent_test.go) — every case isolates
itself in a unique throwaway schema and a unique lock key, so they
never touch the real schema_migrations table or block real production
runners that share the database. Skip cleanly when DATABASE_URL is
unreachable, matching the pattern already used in
internal/handler/handler_test.go and internal/metrics/business_sampler_pgsleep_test.go.
- TestRunMigrationsConcurrentPending: 16 goroutines apply 5
deliberately non-idempotent migrations (bare CREATE TABLE +
ALTER TABLE ADD COLUMN). Without the lock, concurrent CREATE TABLE
races trip "duplicate key value violates unique constraint
pg_type_typname_nsp_index", so a clean run shows the lock is doing
its job.
- TestRunMigrationsConcurrentAlreadyApplied: 16 goroutines hit the
EXISTS no-op path against a pre-populated bookkeeping table; the
state must be unchanged.
- TestRunMigrationsAdvisoryLockSerializes: an external connection
holds the same advisory lock; we assert that zero of the 16
runners complete during a 1 s observation window, then release
the side lock and let them all finish. Catches the original
MUL-2923 bug where the lock got attached to a random pooled
connection.
- TestRunMigrationsConcurrentMixedPoolStress: same pending case but
with a deliberately small pool (runners/2), forcing pgxpool.Acquire
contention to overlap with pg_advisory_lock contention.
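The observation-window assertion used by the lock-serializes case can be sketched with the standard library alone. This is an illustration under stated assumptions, not the test in cmd/migrate: a sync.Mutex stands in for the Postgres advisory lock, and observeBlocked is a hypothetical name.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// observeBlocked holds a lock, starts n runners that all need it, and reports
// how many runners finished while the lock was held and how many had finished
// once it was released. The mutex is a stand-in for pg_advisory_lock.
func observeBlocked(n int, window time.Duration) (during, after int64) {
	var mu sync.Mutex
	var done atomic.Int64
	var wg sync.WaitGroup

	mu.Lock() // the "external connection" takes the lock first
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock() // blocks, like the blocking advisory lock call
			defer mu.Unlock()
			done.Add(1) // the migration loop would run here
		}()
	}

	time.Sleep(window)
	during = done.Load() // must be 0 if the lock really serializes

	mu.Unlock() // release the side lock and let the runners drain
	wg.Wait()
	return during, done.Load()
}

func main() {
	during, after := observeBlocked(16, 100*time.Millisecond)
	fmt.Printf("completed during window: %d, after release: %d\n", during, after)
}
```

If the lock were attached to a different session than the one running the loop, the first number would be non-zero, which is the failure this style of assertion is meant to surface.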
Verified locally: `go test -race -count=10 ./cmd/migrate/` passes in
~15 s. Mutation test (lock acquire/release replaced with `SELECT 1`)
confirms the pending and lock-serializes tests both fail loudly,
catching the regression they were written to detect.
go.mod tidy promotes golang.org/x/sync to a direct dependency
(now imported by the test for errgroup) and incidentally fixes a
stale `// indirect` annotation on prometheus/client_model, which is
already imported directly by internal/metrics/testutil.go.
Co-authored-by: multica-agent <github@multica.ai>
* test(migrate): gofmt + address review nits (MUL-2956)
- gofmt -w cmd/migrate/migrate_concurrent_test.go: fixture struct field
alignment.
- quoteQualifiedIdentifier: actually reject identifiers with more than
one dot (the previous version split on the first dot only and would
silently sanitize "a.b.c" into "a"."b.c", contradicting the comment).
Inline the splitter via strings.Split now that we explicitly check the
component count.
- Soften the test's lock-key comment from "never collide" to the
accurate probabilistic statement (~1 in 2^62 collision odds with the
production constant).
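A minimal sketch of the component-count check described above, using only the standard library. The real helper goes through pgx.Identifier{}.Sanitize; here quoteIdent is a hypothetical stand-in that approximates that behavior (double embedded quotes, drop NUL bytes), and rejecting empty components is an assumption of this sketch rather than something the commit states.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent double-quotes a single identifier component, doubling any
// embedded double quote and dropping NUL bytes. Stand-in for the pgx call.
func quoteIdent(s string) string {
	s = strings.ReplaceAll(s, "\x00", "")
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// quoteQualifiedIdentifier accepts "table" or "schema.table" and rejects
// anything with more than one dot instead of silently folding it.
func quoteQualifiedIdentifier(name string) (string, error) {
	parts := strings.Split(name, ".")
	if len(parts) > 2 {
		return "", fmt.Errorf("identifier %q has %d components, want at most 2", name, len(parts))
	}
	for i, p := range parts {
		if p == "" { // assumption: empty components are treated as errors
			return "", fmt.Errorf("identifier %q has an empty component", name)
		}
		parts[i] = quoteIdent(p)
	}
	return strings.Join(parts, "."), nil
}

func main() {
	for _, name := range []string{"schema_migrations", "tenant.schema_migrations", "a.b.c"} {
		quoted, err := quoteQualifiedIdentifier(name)
		fmt.Printf("%s -> %s (err: %v)\n", name, quoted, err)
	}
}
```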
go test -race -count=10 ./cmd/migrate/ still passes (~15 s).
Co-authored-by: multica-agent <github@multica.ai>
* test(migrate): direction whitelist + tidy go.mod (MUL-2956)
Address two follow-ups from review:
- runMigrations now whitelist-checks opts.Direction up-front and
returns an error for anything that is not "up" or "down". The
previous shape relied on `opts.Direction == "up"` and an else branch,
so a typo or empty string would silently fall through to the
rollback path. Add TestRunMigrationsRejectsInvalidDirection covering
the empty string, "UP"/"DOWN" case mismatches, "rollback", and a
whitespace-padded value; the check fires before any pool work, so
the test runs without Postgres.
- go mod tidy: promotes google.golang.org/protobuf to a direct
dependency (it is imported directly elsewhere in the module and was
stale-marked indirect).
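The up-front direction check might look roughly like this. validateDirection is a hypothetical name; in the commit the check lives inside runMigrations and reads opts.Direction.

```go
package main

import "fmt"

// validateDirection accepts exactly "up" or "down". Anything else, including
// the empty string, case variants, and padded values, is an error, so a typo
// can never fall through to the rollback path.
func validateDirection(direction string) error {
	switch direction {
	case "up", "down":
		return nil
	default:
		return fmt.Errorf("invalid direction %q: want \"up\" or \"down\"", direction)
	}
}

func main() {
	for _, d := range []string{"up", "down", "", "UP", "rollback", " up "} {
		fmt.Printf("%q -> %v\n", d, validateDirection(d))
	}
}
```

Because the check needs no database handle, it can run before any pool work, which is why the corresponding test does not require Postgres.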
go test -race -count=10 ./cmd/migrate/ green (~15.7 s, 50/50).
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: wei-heshang <wei-heshang@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
cmd/migrate previously ran a check-then-apply loop on a *pgxpool.Pool
with no locking, so two backend pods starting at the same time (multi-
replica Deployment, scale-up, or a manual run overlapping with pod
startup) could both pass the EXISTS check on a pending migration and
race on the DDL or the schema_migrations INSERT, crashing the loser.
Take a single connection from the pool, hold a session-level
pg_advisory_lock for the entire migration loop, and release it on the
way out. We use the blocking variant so a late arriver queues behind
the current runner and then no-ops on the EXISTS checks instead of
crash-looping. The loop deliberately stays outside a transaction so
existing CREATE INDEX CONCURRENTLY migrations keep working.
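The lock-hold-release shape described above can be sketched as follows. This is not the code in cmd/migrate: it assumes a database/sql-style connection (the execer interface, which *sql.Conn satisfies) instead of a pgx pooled connection, and withAdvisoryLock, recordingConn, and demo are hypothetical names used only for illustration. The point is that lock, loop, and unlock all go through the same pinned connection.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
)

// execer is the subset of *sql.Conn this sketch needs. A single pinned
// connection matters because session-level advisory locks belong to the
// session that took them.
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// withAdvisoryLock blocks until the lock is free, runs fn, then unlocks on
// the same connection. No transaction is opened around fn.
func withAdvisoryLock(ctx context.Context, conn execer, key int64, fn func() error) (err error) {
	if _, err = conn.ExecContext(ctx, "SELECT pg_advisory_lock($1)", key); err != nil {
		return fmt.Errorf("acquire advisory lock: %w", err)
	}
	defer func() {
		// Use a fresh context so a cancelled ctx does not skip the unlock.
		_, uerr := conn.ExecContext(context.Background(), "SELECT pg_advisory_unlock($1)", key)
		if uerr != nil {
			err = errors.Join(err, fmt.Errorf("release advisory lock: %w", uerr))
		}
	}()
	return fn()
}

// recordingConn is a fake connection that records statements in order.
type recordingConn struct{ stmts []string }

func (r *recordingConn) ExecContext(_ context.Context, query string, _ ...any) (sql.Result, error) {
	r.stmts = append(r.stmts, query)
	return nil, nil
}

// demo runs the helper against the fake and returns what was executed.
func demo(fail bool) ([]string, error) {
	conn := &recordingConn{}
	err := withAdvisoryLock(context.Background(), conn, 42, func() error {
		conn.stmts = append(conn.stmts, "-- migration loop runs here")
		if fail {
			return errors.New("migration failed")
		}
		return nil
	})
	return conn.stmts, err
}

func main() {
	stmts, err := demo(false)
	for _, s := range stmts {
		fmt.Println(s)
	}
	fmt.Println("err:", err)
}
```

With the blocking call, a late arriver simply waits in pg_advisory_lock; by the time it gets the lock the earlier runner has recorded its migrations, so the EXISTS checks turn the second run into a no-op.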
Also refresh the values.yaml / backend.yaml comments next to
backend.replicas: the chart still ships replicas: 1 by default, but
that is now a recommendation (Recreate strategy, no leader split), not
a correctness requirement.
Refs https://github.com/multica-ai/multica/issues/3647
Co-authored-by: multica-agent <github@multica.ai>
Replace raw fmt/log calls with structured slog logger (Go) and
console-based logger (TypeScript). Add request logging middleware.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add idempotent seed tool with duplicate detection for agents/issues/comments
- Add migration CLI supporting up/down with schema_migrations tracking
- Add Makefile targets: make setup (first-time), make start, make stop
- Update .gitignore for test artifacts and compiled binaries
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>