mirror of
https://github.com/multica-ai/multica.git
synced 2026-06-17 11:48:42 +02:00
* feat(daemon): harden agent mention-loop instructions Two agents that mention each other via `mention://agent/<id>` can fall into an infinite reply loop — each says "I'm done" in prose but keeps `@mentioning` the other, which re-enqueues their run. Adding hard caps on agent-to-agent turns conflicts with Multica's design principle of giving agents the same authorship freedom as humans, so this change hardens the instructions that the harness injects instead. - Replace the terse "mentions are actions" blurb with a full Mentions protocol: `side-effecting` warning, explicit "when NOT to mention" (replying to another agent, sign-offs, thanks) and "when a mention IS appropriate" (human escalation, first-time delegation, user asked). - Add a pre-workflow decision step for comment-triggered runs: decide whether a reply is warranted at all, decide whether to include any `@mention`, and clarify that the post-a-comment rule is mandatory *if* you reply — silence is a valid exit for agent-to-agent threads. - Thread the triggering comment's author kind + display name (`TriggerAuthorType` / `TriggerAuthorName`) from the claim endpoint through the daemon task type, per-turn prompt, and CLAUDE.md workflow. When the author is another agent, both surfaces now name that agent and warn against sign-off mentions. - Soften the old closing line that told agents to `always` use the mention format — the word generalized to member/agent mentions and encouraged the very behavior that causes loops. Refs GH#1576, MUL-1323. * fix(daemon): remove MUST-respond conflict and sanitize trigger author name Addresses two blocking points on PR #1581: 1. buildCommentPrompt told the agent "You MUST respond to THIS comment" and unconditionally appended the reply command — directly conflicting with the new agent-to-agent silence-as-valid-exit workflow. Models were likely to keep following the older must-reply rule and fall back into the loop this PR is trying to close. Rewrite the header as "Focus on THIS comment — do not confuse it with previous ones" (keeps the anti-stale-comment signal) and change BuildCommentReplyInstructions to open with "If you decide to reply, post it by running exactly this command" so the reply command is available but conditional across both prompt surfaces. 2. Raw agent/user display names were being embedded directly into the high-priority prompt and CLAUDE.md via TriggerAuthorName. Agent and member names are only validated as non-empty at write time, so a name containing newlines, backticks, or fake mention markup would turn the field into a cross-agent prompt-injection surface. Add execenv.SanitizePromptField — strip control runes, collapse whitespace, drop markdown structural characters (backtick, asterisk, brackets, pipe, angle brackets, hash, backslash), truncate to 64 runes — and apply it at both embed sites (per-turn prompt and CLAUDE.md). Defense-in-depth at the consumption layer so this works for already-stored names without a migration. Tests: TestSanitizePromptField covers the policy; TestBuildPromptSanitizesAgentName plants an attack payload in TriggerAuthorName and checks the rendered prompt does not leak the newline-anchored injection or the fake mention markup. TestBuildPromptCommentTriggered*{,ByMember} updated to lock in the conditional reply-command framing. * refactor(daemon): trim redundant CLAUDE.md preamble and drop name sanitizer Per PR #1581 feedback: 1. Remove the `if ctx.TriggerAuthorType == "agent"` preamble block in runtime_config.go. It duplicated what workflow steps 4 and 5 already say ("Decide whether a reply is warranted", "Never @mention the agent you are replying to as a thank-you or sign-off"), so the signal lands the same without the extra ~7 lines of CLAUDE.md. The per-turn prompt preamble in prompt.go stays — that surface has no numbered workflow below it and would otherwise lose the silence-as-exit signal. 2. Delete execenv.SanitizePromptField + its test. Workspace agents are created by trusted team members, so the cross-agent name-injection surface it defended isn't realistic in the current trust model. 3. Drop TriggerAuthorType/Name from execenv.TaskContextForEnv and stop populating them in daemon.go — they're no longer read by the execenv package. The same fields on daemon.Task stay because prompt.go still needs them to label the triggering author in the per-turn prompt. Tests simplified to match the leaner shape: CLAUDE.md regression guards now assert that the anti-loop phrases live in the numbered workflow, and the sanitizer-specific tests are removed.