multica

mirror of https://github.com/multica-ai/multica.git synced 2026-07-05 21:39:54 +02:00

Files

Bohan Jiang 6d9ebb0fdd fix(daemon): unblock issues stuck on a poisoned-image agent session (#2314)

* fix(daemon): treat upstream API 400 invalid_request_error as poisoned session

When an issue description contains a markdown-linked image, the agent can
download a tiny CDN auth-error file in its place and Read it as a PNG.
That poisons the conversation: the LLM API rejects the bad image with
400 invalid_request_error, the session_id is pinned mid-flight, and every
follow-up task on the issue (comment-trigger, auto-retry) resumes the same
poisoned conversation and hits the same 400. The issue can no longer be
executed, even after the description is cleaned up.

Mirror the existing fallback-output classifier on the error side: detect
"API Error: ... 400 ... invalid_request_error" in the agent error string,
persist failure_reason='api_invalid_request', and add it to the
GetLastTaskSession exclusion list so the next task starts a fresh
session that re-reads the (now-clean) description.
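
A minimal Go sketch of the error-side classification described above. It is
not the daemon's actual code: the function name classifyFailure, the
constants, and the exact regular expression are assumptions made for
illustration. Only the two failure_reason values and the
"API Error: ... 400 ... invalid_request_error" shape come from this commit
message.

```go
package main

import (
	"fmt"
	"regexp"
)

// Failure reasons named in the commit message. The constant names are
// hypothetical; only the string values appear in the source.
const (
	reasonAgentError        = "agent_error"
	reasonAPIInvalidRequest = "api_invalid_request"
)

// invalidRequestRe is an assumed pattern for the
// "API Error: ... 400 ... invalid_request_error" shape.
// (?i) makes it case-insensitive, (?s) lets "." span newlines.
var invalidRequestRe = regexp.MustCompile(`(?is)API Error:.*\b400\b.*invalid_request_error`)

// classifyFailure maps an agent error string to a failure_reason.
// Anything that does not look like an upstream 400 keeps the default.
func classifyFailure(errText string) string {
	if invalidRequestRe.MatchString(errText) {
		return reasonAPIInvalidRequest
	}
	return reasonAgentError
}

func main() {
	poisoned := `API Error: 400 {"type":"error","error":{"type":"invalid_request_error","message":"Could not process image"}}`
	fmt.Println(classifyFailure(poisoned))
	fmt.Println(classifyFailure("tool error: command timed out"))
}
```

Matching on the error text keeps the classifier independent of which image
or tool call triggered the rejection, at the cost of depending on the
upstream error format staying stable.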

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): unblock issues already poisoned by API 400 invalid_request_error

The forward-only classifier from the previous commit only tags new failures.
Issues like MUL-1918 already have multiple failed-task rows whose
failure_reason is the pre-fix default 'agent_error', and GetLastTaskSession
falls back to those legacy rows on the next claim — so deploying the
classifier alone leaves existing poisoned issues stuck (GPT-Boy review
on PR #2314).

Two complementary changes:

- Migration 079 backfills failure_reason='api_invalid_request' on every
  pre-existing 'agent_error' row whose error text matches the canonical
  Anthropic 400 invalid_request_error shape. Keeps observability
  consistent (multica issue runs / UI now report the right reason).

- GetLastTaskSession adds a defensive ILIKE clause on error text. Closes
  the deploy-window gap where the old binary could write a new
  'agent_error' row between the migration running and the new code
  taking over, and protects against future error-format variants the
  daemon classifier might miss.

Plus regression tests covering the legacy + new coexistence case GPT-Boy
flagged, and a guard rail asserting benign 'agent_error' failures
(timeouts, tool errors) still resume their session.
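
The session-selection behavior described above can be sketched in memory
as follows. This is an illustration only: the real GetLastTaskSession is a
SQL query, and the taskRow type, the helper names, and the substring check
standing in for the ILIKE clause are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// taskRow is a hypothetical, simplified view of a task record.
type taskRow struct {
	SessionID     string
	FailureReason string // empty when the task did not fail
	ErrorText     string
}

// isPoisoned reports whether a row must not be resumed. The first check
// covers rows tagged by the classifier or the backfill; the second is a
// stand-in for the defensive ILIKE clause and catches untagged
// 'agent_error' rows whose error text has the 400 shape.
func isPoisoned(t taskRow) bool {
	if t.FailureReason == "api_invalid_request" {
		return true
	}
	lower := strings.ToLower(t.ErrorText)
	return strings.Contains(lower, "400") &&
		strings.Contains(lower, "invalid_request_error")
}

// lastResumableSession scans rows ordered newest first and returns the
// first session that is safe to resume. ok is false when the next task
// should start a fresh session.
func lastResumableSession(rows []taskRow) (sessionID string, ok bool) {
	for _, t := range rows {
		if t.SessionID == "" || isPoisoned(t) {
			continue
		}
		return t.SessionID, true
	}
	return "", false
}

func main() {
	// Legacy and newly tagged poisoned rows coexisting on one issue.
	stuck := []taskRow{
		{"s1", "api_invalid_request", "API Error: 400 invalid_request_error"},
		{"s1", "agent_error", "API Error: 400 invalid_request_error"},
	}
	_, ok := lastResumableSession(stuck)
	fmt.Println("resume poisoned issue:", ok)

	// A benign failure still resumes its session.
	benign := []taskRow{{"s2", "agent_error", "tool error: timed out"}}
	id, ok := lastResumableSession(benign)
	fmt.Println("resume benign issue:", id, ok)
}
```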

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

2026-05-09 14:39:10 +08:00

generated    fix(daemon): unblock issues stuck on a poisoned-image agent session (#2314)    2026-05-09 14:39:10 +08:00
queries      fix(daemon): unblock issues stuck on a poisoned-image agent session (#2314)    2026-05-09 14:39:10 +08:00