multica/server/migrations/066_force_fresh_session.down.sql at bf1e375015f0bb6e2f736eaeb53ee1fa5cd84bdd - multica - Gitea: Git with a cup of tea

highperfocused/multica

mirror of https://github.com/multica-ai/multica.git synced 2026-06-17 11:48:42 +02:00

Files

Bohan Jiang b1345685a3 fix(task): rerun starts a fresh session, skip poisoned resume (#1928 )

* fix(task): rerun starts a fresh session, skip poisoned resume

When a task ended in a known agent fallback ("I reached the iteration
limit and couldn't generate a summary.", "Put your final update inside
the content string. Keep it concise.") the (agent_id, issue_id) resume
lookup would still pick that session, so a manual rerun inherited the
poisoned state and reproduced the same bad output.

Two complementary guards:

1. Daemon classifies poisoned terminal output and routes it through the
   blocked path with failure_reason set ('iteration_limit' /
   'agent_fallback_message'). GetLastTaskSession excludes failed tasks
   with those reasons, so even comment-triggered tasks no longer resume
   them. Tasks that failed mid-flight (timeout, runtime_recovery, etc.)
   are still resumable, preserving MUL-1128's auto-retry contract.

2. Manual rerun marks the new task force_fresh_session=true. The daemon
   claim handler skips the resume lookup entirely when the flag is set,
   capturing the user-intent signal that "the prior output was bad" even
   when poisoned classification misses a future fallback wording.

Auto-retry of orphaned mid-flight failures (MaybeRetryFailedTask →
CreateRetryTask) does not take this path, so it keeps resuming.

Tests: classifyPoisonedOutput unit test; integration tests assert the
SQL filter excludes poisoned classifiers, RerunIssue flips the flag,
and the normal enqueue path leaves it false.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): cap poisoned-output matcher to short trimmed text

GPT-Boy review on MUL-1630: the previous strings.Contains match would
classify any output that quoted the marker substring — including a
review/analysis that simply discussed the marker itself. Real fallback
messages are short single-sentence affairs, so cap the candidate at
~one paragraph and trim whitespace before matching. Adds regression
tests covering a long quoting review and a marker buried in a long
real conclusion; both must stay classified as completed.

Co-authored-by: multica-agent <github@multica.ai>

* fix(migrations): rename 065 force_fresh_session → 066 to clear collision

main introduced 065_project_resources after this branch was cut, so
both files shared the 065_ prefix. The readiness check
(server/cmd/server/health.go → migrations.LatestVersion) takes the
last entry by lexical order, which is 065_project_resources, leaving
this branch's 065_force_fresh_session unguarded — a deploy that
applied project_resources but not force_fresh_session would still
report ready, and the next enqueue / rerun / claim would crash on
"column force_fresh_session does not exist".

Renaming to 066_force_fresh_session puts it strictly after
project_resources so readiness blocks until it's applied.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

2026-04-30 14:17:53 +08:00

2 lines

62 B

SQL

Raw Blame History

ALTER TABLE agent_task_queue DROP COLUMN force_fresh_session;