multica/server/internal/daemon/execenv/codex_multi_agent.go
Bohan Jiang bd82607645 fix(execenv): default-disable Codex native multi-agent in per-task config (#1845)
* fix(execenv): default-disable Codex native multi-agent in per-task config

Recent Codex app-server releases enable features.multi_agent by default,
exposing spawn_agent / wait / close_agent tools that let a parent thread
spawn nested subagents. The daemon currently models only the parent thread,
so the parent's turn/completed is treated as task completion even when
spawned children are still running — leading to premature task completion
and dropped child output.

Disable features.multi_agent by default in the per-task CODEX_HOME/config.toml
so Multica's task lifecycle is the only orchestration layer in play. Strip
both the dotted-key form (features.multi_agent) at TOML root and the
multi_agent key inside a [features] table; siblings and unrelated tables
are preserved. Honor MULTICA_CODEX_MULTI_AGENT=1 (or true, yes, on;
case-insensitive) as an opt-out from this default for users who
explicitly want Codex native subagents inside a Multica task.
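
As a rough illustration of the table-scoped stripping described above,
the following is a compact sketch, not the daemon's code. The function
name `stripMultiAgent`, the variable names, and the sample keys
(`other_flag`, `[profiles.demo]`) are hypothetical. It assumes table
headers always sit on their own line.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Illustrative patterns; the names are hypothetical.
var (
	featuresHeader = regexp.MustCompile(`^\s*\[\s*features\s*\]\s*(?:#.*)?$`)
	inTableKey     = regexp.MustCompile(`^\s*multi_agent\s*=`)
	rootDottedKey  = regexp.MustCompile(`^\s*features\s*\.\s*multi_agent\s*=`)
)

// stripMultiAgent drops `features.multi_agent` at the TOML root and
// `multi_agent` inside a bare [features] table. Every other line,
// including a same-named key in another table, is kept.
func stripMultiAgent(content string) string {
	var out []string
	table := "" // "" means we are still at the TOML root
	for _, line := range strings.Split(content, "\n") {
		trimmed := strings.TrimSpace(line)
		switch {
		case featuresHeader.MatchString(line):
			table = "features"
		case strings.HasPrefix(trimmed, "["):
			table = trimmed // some other table; its keys are left alone
		case table == "" && rootDottedKey.MatchString(line):
			continue // drop the root dotted-key form
		case table == "features" && inTableKey.MatchString(line):
			continue // drop the in-table form
		}
		out = append(out, line)
	}
	return strings.Join(out, "\n")
}

func main() {
	in := strings.Join([]string{
		"[features]",
		"multi_agent = true",
		"other_flag = true",
		"[profiles.demo]",
		"multi_agent = true",
	}, "\n")
	fmt.Println(stripMultiAgent(in))
}
```

Only the `multi_agent` line under `[features]` is removed here; the one
under the hypothetical `[profiles.demo]` table belongs to a different
key path and survives.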

The user's global ~/.codex/config.toml is never modified — only the daemon's
isolated per-task copy.

Also widen managedBlockRe to consume `\n*` rather than `\n?` so reruns
don't accumulate blank lines when both the sandbox and multi-agent managed
blocks coexist.
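
To make the blank-line accumulation concrete, here is a simplified
sketch under stated assumptions: a single demo block with made-up
markers, and a naive `rewrite` helper (hypothetical) that strips the old
block and prepends a fresh one plus a separator line. The real daemon
has two managed blocks and extra trimming, so this only isolates the
regex behaviour.

```go
package main

import (
	"fmt"
	"regexp"
)

// Demo markers, not the daemon's real ones.
const (
	beginMarker = "# BEGIN demo-managed block"
	endMarker   = "# END demo-managed block"
)

// blockRe builds the block-matching pattern with a configurable suffix
// so the `\n?` and `\n*` variants can be compared side by side.
func blockRe(trailing string) *regexp.Regexp {
	return regexp.MustCompile(`(?ms)^` + regexp.QuoteMeta(beginMarker) +
		`.*?^` + regexp.QuoteMeta(endMarker) + trailing)
}

// rewrite removes any previous block, then writes a fresh block followed
// by one blank separator line and the remaining content.
func rewrite(content string, re *regexp.Regexp) string {
	rest := re.ReplaceAllString(content, "")
	block := beginMarker + "\nkey = false\n" + endMarker + "\n"
	if rest == "" {
		return block
	}
	return block + "\n" + rest
}

func main() {
	for _, trailing := range []string{`\n?`, `\n*`} {
		re := blockRe(trailing)
		out := "other = 1\n"
		for i := 0; i < 3; i++ {
			out = rewrite(out, re)
		}
		fmt.Printf("%s -> %d bytes after 3 runs\n", trailing, len(out))
	}
}
```

With `\n?` the separator line written on the previous run is left
behind on each strip, so the file grows by one newline per rerun. With
`\n*` the strip also consumes the separator and the output is stable.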

* fix(execenv): inject managed multi_agent inside existing [features] table

Per PR review (codex_multi_agent.go:77-83 vs :112-115): when the user's
config.toml already has a top-level `[features]` table, writing
`features.multi_agent = false` at the TOML root implicitly redefines the
same `features` table. The strict TOML parser used by Codex (`toml-rs`)
rejects that with `table 'features' already exists`, so Codex would fail
to load the per-task config and refuse to start the thread. Verified the
strict-parser failure with pelletier/go-toml/v2; the previous
BurntSushi/toml-based regression test was permissive enough to miss it.

Detect a root-level `[features]` header and place the managed block
inside that table (`multi_agent = false` with marker comments). When no
such header exists, keep the existing root-level dotted-key form. The
managed-block regex matches both layouts so reruns and layout
transitions stay idempotent. A `[features.experimental]` sub-table
without a bare `[features]` header still uses the root dotted-key form,
which is spec-valid (no explicit redefinition).
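
The layout decision can be sketched roughly as below. `managedLine` is a
hypothetical helper that only returns the body line that would be
written; marker comments and the actual splice are left out. The header
pattern mirrors the one in the source file.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// featuresHeader matches a bare [features] header, tolerating whitespace
// inside the brackets and a trailing comment.
var featuresHeader = regexp.MustCompile(`^\s*\[\s*features\s*\]\s*(?:#.*)?$`)

// managedLine picks the key form that does not redefine an existing
// table: the in-table key when a bare [features] header is present,
// the root dotted key otherwise.
func managedLine(config string) string {
	for _, line := range strings.Split(config, "\n") {
		if featuresHeader.MatchString(line) {
			return "multi_agent = false"
		}
	}
	return "features.multi_agent = false"
}

func main() {
	samples := []string{
		"[features]\nother_flag = true",        // bare header
		"[ features ]  # user comment",         // header variant
		"[features.experimental]\nflag = true", // sub-table only
		"",                                     // empty config
	}
	for _, s := range samples {
		fmt.Printf("%q -> %s\n", s, managedLine(s))
	}
}
```

The first two samples select the in-table form; the sub-table-only and
empty configs fall back to the root dotted key.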

Tests now use pelletier/go-toml/v2 to actually parse the output and
assert features.multi_agent decodes to false; the regression case from
the PR review is covered explicitly.

* fix(execenv): recognize feature table header variants

---------

Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
2026-04-29 17:17:09 +08:00


package execenv

import (
	"fmt"
	"log/slog"
	"os"
	"regexp"
	"strings"
)

// Background
//
// Recent Codex `app-server` releases enable `features.multi_agent` by
// default, exposing spawn_agent / send_input / wait / resume_agent /
// close_agent tools to the model so a Codex thread can fan out into nested
// subagents. The Multica daemon currently models only the parent Codex
// thread per task: when the parent emits `turn/completed`, the task is
// marked terminal even if spawned children are still running or have not
// flushed their output. The result is a class of premature-completion
// failures where useful child-agent work is dropped.
//
// Until either Codex exposes a "parent done but children still open"
// lifecycle state with drain/cancel primitives, or the Multica runtime
// models child threads as first-class task entities, the daemon disables
// Codex native multi-agent by default for daemon-managed task sessions.
// The override only touches the per-task `CODEX_HOME/config.toml`; the
// user's global `~/.codex/config.toml` is never modified.
//
// Users who explicitly want Codex native subagents inside a Multica task
// (and accept the lifecycle risk) can keep the feature enabled by setting
// `MULTICA_CODEX_MULTI_AGENT=1` in the daemon environment.
//
// Layout note
//
// TOML rejects redefining a table that has already been created — including
// implicitly via a dotted key. So the managed block must adapt to the
// user's existing config:
//
//   - If the user's config contains a top-level `[features]` table, the
//     managed `multi_agent = false` is injected INSIDE that table (with
//     marker comments). Writing `features.multi_agent = false` at the
//     TOML root would implicitly redefine the same `features` table and
//     the strict TOML parser used by Codex (`toml-rs`) would fail with
//     `table 'features' already exists`.
//   - Otherwise, the managed block lives at the top of the file with the
//     dotted-key form `features.multi_agent = false`.

// MulticaCodexMultiAgentEnv is the env var users can set to keep Codex
// native multi-agent enabled inside daemon-managed tasks. Anything truthy
// (1, true, yes, on; case-insensitive) keeps the feature on; everything
// else (including unset) disables it.
const MulticaCodexMultiAgentEnv = "MULTICA_CODEX_MULTI_AGENT"

// multicaMultiAgentBeginMarker / multicaMultiAgentEndMarker delimit the
// multi-agent-specific managed block. Kept separate from the sandbox
// block so each can evolve and migrate independently.
const (
	multicaMultiAgentBeginMarker = "# BEGIN multica-managed multi-agent (do not edit; regenerated by daemon)"
	multicaMultiAgentEndMarker   = "# END multica-managed multi-agent"
)

// `\n*` rather than `\n?` so reruns don't accumulate blank lines when this
// block coexists with the sandbox managed block in the same file. The same
// regex matches the block whether it sits at the file root (dotted-key
// form) or inside a `[features]` table; only the body keys differ.
var multiAgentBlockRe = regexp.MustCompile(
	`(?ms)^` + regexp.QuoteMeta(multicaMultiAgentBeginMarker) +
		`.*?^` + regexp.QuoteMeta(multicaMultiAgentEndMarker) + `\n*`)

var (
	// matches a top-level `[features]` table header, allowing TOML's optional
	// whitespace inside brackets and inline comments after the header.
	rootFeaturesTableHeaderRe = regexp.MustCompile(`^\s*\[\s*features\s*\]\s*(?:#.*)?$`)

	// matches `multi_agent = ...` (with optional whitespace) inside a
	// `[features]` table.
	featuresTableMultiAgentRe = regexp.MustCompile(`^\s*multi_agent\s*=`)

	// matches `features.multi_agent = ...` at the TOML root (top-level
	// dotted-key form, including TOML's optional whitespace around dots).
	rootDottedMultiAgentRe = regexp.MustCompile(`^\s*features\s*\.\s*multi_agent\s*=`)
)

// codexMultiAgentEnabled reports whether the user opted into keeping Codex
// native multi-agent on for daemon-managed tasks.
func codexMultiAgentEnabled() bool {
	raw := strings.TrimSpace(os.Getenv(MulticaCodexMultiAgentEnv))
	switch strings.ToLower(raw) {
	case "1", "true", "yes", "on":
		return true
	}
	return false
}

// renderMulticaMultiAgentBlock returns the daemon-managed multi-agent
// block. The body uses `multi_agent = false` when injected inside a
// `[features]` table, and `features.multi_agent = false` otherwise.
func renderMulticaMultiAgentBlock(inFeaturesTable bool) string {
	var b strings.Builder
	b.WriteString(multicaMultiAgentBeginMarker)
	b.WriteString("\n")
	if inFeaturesTable {
		b.WriteString("multi_agent = false\n")
	} else {
		b.WriteString("features.multi_agent = false\n")
	}
	b.WriteString(multicaMultiAgentEndMarker)
	b.WriteString("\n")
	return b.String()
}

// stripUserMultiAgentDirectives removes any `features.multi_agent = ...`
// line at the TOML root (dotted-key form), plus any `multi_agent = ...`
// line that sits inside a top-level `[features]` table. Both forms encode
// the same TOML key and would conflict with the managed block.
//
// Other tables (`[features.experimental]`, `[profiles.foo]`, ...) are
// preserved untouched: they live under their own scope and don't redefine
// `features.multi_agent` at the root.
func stripUserMultiAgentDirectives(content string) string {
	lines := strings.Split(content, "\n")
	out := make([]string, 0, len(lines))
	currentTable := "" // empty = TOML root
	for _, line := range lines {
		trimmed := strings.TrimSpace(line)
		if rootFeaturesTableHeaderRe.MatchString(line) {
			currentTable = "[features]"
			out = append(out, line)
			continue
		}
		if strings.HasPrefix(trimmed, "[") {
			currentTable = trimmed
			out = append(out, line)
			continue
		}
		switch currentTable {
		case "":
			if rootDottedMultiAgentRe.MatchString(trimmed) {
				continue
			}
		case "[features]":
			if featuresTableMultiAgentRe.MatchString(trimmed) {
				continue
			}
		}
		out = append(out, line)
	}
	return strings.Join(out, "\n")
}

// hasRootFeaturesTable reports whether the file contains a top-level
// `[features]` table header. Sub-tables like `[features.experimental]` do
// NOT count: they implicitly create `features` but don't conflict with a
// root-level `features.multi_agent` dotted key.
func hasRootFeaturesTable(content string) bool {
	for _, line := range strings.Split(content, "\n") {
		if rootFeaturesTableHeaderRe.MatchString(line) {
			return true
		}
	}
	return false
}

// injectManagedBlockIntoFeaturesTable inserts the in-table managed block
// immediately after the first `[features]` header line. Caller must have
// already stripped any prior managed block and any user-set `multi_agent`
// directive from inside the table.
func injectManagedBlockIntoFeaturesTable(content string) string {
	block := renderMulticaMultiAgentBlock(true)
	// Drop the trailing `\n` so we don't introduce a stray blank line when
	// splicing block lines between existing lines.
	blockLines := strings.Split(strings.TrimRight(block, "\n"), "\n")

	lines := strings.Split(content, "\n")
	for i, line := range lines {
		if !rootFeaturesTableHeaderRe.MatchString(line) {
			continue
		}
		out := make([]string, 0, len(lines)+len(blockLines))
		out = append(out, lines[:i+1]...)
		out = append(out, blockLines...)
		out = append(out, lines[i+1:]...)
		return strings.Join(out, "\n")
	}
	return content
}

// ensureCodexMultiAgentConfig writes the daemon-managed multi-agent block
// into the per-task config.toml so Codex native subagents stay disabled.
// Idempotent: running it twice produces the same file.
//
// When MULTICA_CODEX_MULTI_AGENT is set to a truthy value, the function is
// a no-op: the user has explicitly opted into Codex native subagents and
// accepts the lifecycle risk. Toggling the env var across prepare runs is
// not supported: the per-task config is short-lived (recreated per task),
// so users should set the var once at daemon start.
func ensureCodexMultiAgentConfig(configPath string, logger *slog.Logger) error {
	if codexMultiAgentEnabled() {
		if logger != nil {
			logger.Info("codex multi-agent: leaving Codex native multi-agent untouched per MULTICA_CODEX_MULTI_AGENT",
				"config_path", configPath,
			)
		}
		return nil
	}

	data, err := os.ReadFile(configPath)
	if err != nil && !os.IsNotExist(err) {
		return fmt.Errorf("read config.toml: %w", err)
	}
	existing := string(data)

	// Always strip any previously written managed block (root or in-table
	// form) so reruns and layout transitions stay clean.
	existing = multiAgentBlockRe.ReplaceAllString(existing, "")

	// Strip user-set directives in both encodings; the managed block re-adds
	// the canonical form below.
	existing = stripUserMultiAgentDirectives(existing)

	var updated string
	if hasRootFeaturesTable(existing) {
		updated = injectManagedBlockIntoFeaturesTable(existing)
	} else {
		existing = strings.TrimLeft(existing, "\n")
		block := renderMulticaMultiAgentBlock(false)
		if existing == "" {
			updated = block
		} else {
			updated = block + "\n" + existing
		}
	}

	if updated == string(data) {
		return nil
	}
	if err := os.WriteFile(configPath, []byte(updated), 0o644); err != nil {
		return fmt.Errorf("write config.toml: %w", err)
	}
	return nil
}