mirror of
https://github.com/multica-ai/multica.git
synced 2026-06-16 19:29:26 +02:00
fix(timezone): harden hourly-rollup rollout against straight-through migrate MUL-2488 (#2998)
* fix(timezone): harden hourly-rollup rollout against straight-through migrate MUL-2488 PR #2968 introduced the new task_usage_hourly rollup but assumed operators would stop migrate between 102 and 103 to run the one-shot cmd/backfill_task_usage_hourly. Two pieces made that unsafe in practice: 1. The Dockerfile only shipped server / multica / migrate, so a deployed container has no backfill binary to run between phases. 2. cmd/migrate has no per-version stop, and entrypoint.sh runs `migrate up` to the latest version, so 103 silently drops the legacy daily rollups even when nobody ran the backfill — leaving usage dashboards at zero despite source data being intact in task_usage. Changes: - Build cmd/backfill_task_usage_hourly into the runtime image alongside the other binaries so operators can `docker exec` the backfill instead of needing a source checkout. - Add a fail-closed plpgsql guard at the top of migration 103 that aborts the migration when task_usage has rows but task_usage_hourly is empty. Fresh databases (no task_usage rows) are exempt because the new triggers from 102 will populate the hourly table on the first event. Already-applied databases are unaffected — schema_migrations tracks by version only, so 103 is not re-run. Co-authored-by: multica-agent <github@multica.ai> * fix(timezone): use watermark coverage for hourly-rollup guard The previous check only required `task_usage_hourly` to be non-empty, which an interrupted backfill or a manual `rollup_task_usage_hourly_window` call both satisfy. The completion signal we actually trust is `task_usage_hourly_rollup_state.watermark_at` — backfill only stamps it to `now() - 5 min` after every monthly slice succeeded, and the cron worker only advances it on a real tick. Default after migration 101 is `1970-01-01`, so an unrun or partial backfill is trivially detected. Also corrects the comment about fresh-install behavior: the triggers in 102 only enqueue dirty keys for agent_task_queue / issue / task_usage DELETE — they do not write hourly rows. INSERT/UPDATE flows through the `updated_at` watermark window of `rollup_task_usage_hourly()`, which only runs once the operator registers it as a pg_cron job. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>
This commit is contained in:
@@ -18,6 +18,7 @@ ARG COMMIT=unknown
|
||||
RUN cd server && CGO_ENABLED=0 go build -ldflags "-s -w -X main.version=${VERSION} -X main.commit=${COMMIT}" -o bin/server ./cmd/server
|
||||
RUN cd server && CGO_ENABLED=0 go build -ldflags "-s -w -X main.version=${VERSION} -X main.commit=${COMMIT}" -o bin/multica ./cmd/multica
|
||||
RUN cd server && CGO_ENABLED=0 go build -ldflags "-s -w" -o bin/migrate ./cmd/migrate
|
||||
RUN cd server && CGO_ENABLED=0 go build -ldflags "-s -w" -o bin/backfill_task_usage_hourly ./cmd/backfill_task_usage_hourly
|
||||
|
||||
# --- Runtime stage ---
|
||||
FROM alpine:3.21
|
||||
@@ -29,6 +30,7 @@ WORKDIR /app
|
||||
COPY --from=builder /src/server/bin/server .
|
||||
COPY --from=builder /src/server/bin/multica .
|
||||
COPY --from=builder /src/server/bin/migrate .
|
||||
COPY --from=builder /src/server/bin/backfill_task_usage_hourly .
|
||||
COPY server/migrations/ ./migrations/
|
||||
COPY docker/entrypoint.sh .
|
||||
RUN sed -i 's/\r$//' entrypoint.sh && chmod +x entrypoint.sh
|
||||
|
||||
Reference in New Issue
Block a user