mirror of
https://github.com/multica-ai/multica.git
synced 2026-07-05 13:29:44 +02:00
* feat(autopilot): auto-pause autopilots with sustained high failure rate

Adds a background monitor that pauses any active autopilot whose recent runs are dominated by failures (defaults: ≥100 terminal runs in 7d, ≥90% failed). The monitor leaves a severity=attention inbox notification for the autopilot's creator (or the agent's owner if the autopilot was agent-created) so a human learns about the auto-pause and can fix the root cause before re-enabling.

Motivated by MUL-1336 §6 #2: a single broken cron autopilot (`Registro de ls cada 5 min`, 1,475/1,476 failed in 7d) was burning ~1.5k tasks/tokens per week with no human in the loop.

Tunable via AUTOPILOT_FAIL_MONITOR_{INTERVAL,LOOKBACK,MIN_RUNS,FAIL_RATIO,STARTUP_DELAY}; INTERVAL=0 disables the monitor entirely.

Co-authored-by: multica-agent <github@multica.ai>

* chore(autopilot): relax failure monitor defaults to daily / 50 runs

Per review feedback in MUL-1339: the 30-min scan was overkill, since the 50-run threshold already provides multi-hour lag, and operational simplicity matters. Lowering MinRuns from 100 → 50 keeps low-frequency autopilots in scope (~7 runs/day reaches the threshold within the 7d window).

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>