Multica Eve 8ad673fdb7 MUL-3560: gate slim runtime brief behind runtime_brief_slim feature flag (#4449)
The MUL-3560 slim runtime brief (kind-driven dispatcher, per-section
gating, and prose compression that saves ~7k chars on the typical
comment-triggered task) now ships behind the `runtime_brief_slim`
feature flag, wired via the framework-level service from MUL-3615.

Default: OFF in every environment (production stays on the legacy
brief that has shipped for ~2 years). Staging opts in via the YAML
rule set; ops can override per-process with `FF_RUNTIME_BRIEF_SLIM=true`.
Production is held back until staging has burned in long enough that
we are confident the slim brief does not regress agent behaviour.
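
The precedence described above (code default off, YAML rule, per-process
environment override) can be sketched roughly as follows. This is a
minimal illustration, not the MUL-3615 service: `resolveFlag`, the
`yamlDefaults` map, and the `FF_` + upper-cased-name convention are
assumptions inferred from `FF_RUNTIME_BRIEF_SLIM`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// resolveFlag is a hypothetical helper, not the real featureflag API.
// Precedence: environment override, then YAML default, then false.
func resolveFlag(name string, yamlDefaults map[string]bool, getenv func(string) string) bool {
	// Assumed naming convention: runtime_brief_slim -> FF_RUNTIME_BRIEF_SLIM.
	envKey := "FF_" + strings.ToUpper(name)
	if raw := getenv(envKey); raw != "" {
		if v, err := strconv.ParseBool(raw); err == nil {
			return v
		}
		// Unparseable override: ignore it and fall through to the YAML rule.
	}
	return yamlDefaults[name] // a missing key yields false, i.e. OFF
}

func main() {
	noEnv := func(string) string { return "" }
	staging := map[string]bool{"runtime_brief_slim": true}

	fmt.Println(resolveFlag("runtime_brief_slim", nil, noEnv))     // production: no rule, no override
	fmt.Println(resolveFlag("runtime_brief_slim", staging, noEnv)) // staging: YAML opts in
}
```

Passing `getenv` as a parameter (rather than calling `os.Getenv`
directly) keeps the sketch testable without mutating the process
environment.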

Architecture (one toggle point, two code paths, both fully tested):

  buildMetaSkillContent (runtime_config.go)
      │
      ├─ useSlimBrief() → false (default)
      │     → fall through to the legacy verbose body that ships on
      │       main today — byte-for-byte unchanged, no migration risk
      │
      └─ useSlimBrief() → true
            → buildMetaSkillContentSlim (runtime_config_sections.go)
              → classifyTask → 5-way kind switch → per-section writers

BuildCommentReplyInstructions takes the same gate, so the per-turn
comment prompt and the runtime brief stay in sync on which template
they emit.
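
A minimal sketch of the single toggle point, assuming a flag service
with one `IsEnabled(name)` method. The `flagChecker` interface and the
`staticFlags` type are hypothetical stand-ins; only the names
`runtimeFlags`, `SetFeatureFlags`, `useSlimBrief`, and
`buildMetaSkillContent` come from this PR, and the bodies here are
illustrative rather than the shipped code.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// flagChecker is a hypothetical stand-in for the feature flag service.
type flagChecker interface {
	IsEnabled(name string) bool
}

// staticFlags is a toy implementation used only for this sketch.
type staticFlags map[string]bool

func (s staticFlags) IsEnabled(name string) bool { return s[name] }

// runtimeFlags holds the wired service; the zero value means "never wired".
var runtimeFlags atomic.Pointer[flagChecker]

// SetFeatureFlags wires (or, with nil, unwires) the service.
func SetFeatureFlags(f flagChecker) {
	if f == nil {
		runtimeFlags.Store(nil)
		return
	}
	runtimeFlags.Store(&f)
}

// useSlimBrief is nil-safe: an unwired daemon falls back to legacy.
func useSlimBrief() bool {
	p := runtimeFlags.Load()
	if p == nil {
		return false
	}
	return (*p).IsEnabled("runtime_brief_slim")
}

// buildMetaSkillContent shows only the dispatch; the real bodies are elided.
func buildMetaSkillContent() string {
	if useSlimBrief() {
		return "slim"
	}
	return "legacy"
}

func main() {
	fmt.Println(buildMetaSkillContent()) // nothing wired yet
	SetFeatureFlags(staticFlags{"runtime_brief_slim": true})
	fmt.Println(buildMetaSkillContent())
}
```

Because both the runtime brief and the comment prompt would consult the
same `useSlimBrief()` function in this shape, they cannot disagree on
which template to emit.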

What's in this PR:

- runtime_config_flag.go (new): package-scope `runtimeFlags` atomic
  pointer + `SetFeatureFlags` setter + `useSlimBrief` toggle point.
  Nil-safe: a daemon that forgets to wire the service falls back to
  legacy, no panic.
- runtime_config_kind.go (new): `taskKind` enum + `classifyTask` +
  `hasIssueContext` predicate. Used only by the slim path.
- runtime_config_sections.go (new): the slim brief itself —
  `buildMetaSkillContentSlim` + per-section `writeXxx` helpers
  + `writeAvailableCommandsQuickCreate` minimal variant +
  `writeBackgroundTaskSafetySlim` compressed safety section. The
  Section × Kind matrix is documented inline on
  `buildMetaSkillContentSlim` and the test below checks the
  dispatcher does not diverge from the spec.
- reply_instructions.go: `BuildCommentReplyInstructions` gains a
  short slim-or-legacy prelude; new `buildCommentReplyInstructionsSlim`
  is the compressed cookbook (defers the shell-hazard rationale to
  `## Comment Formatting`).
- runtime_config.go: `buildMetaSkillContent` gains a 2-line
  dispatcher at the top; the legacy body is otherwise untouched.
- runtime_config_kind_test.go (new): canaries for both paths.
  - TestClassifyTask: 5 kinds + 3 tiebreak cases.
  - TestTaskKindHasIssueContext: predicate semantics.
  - TestSlimFlagOffUsesLegacy: nil flag service → legacy path
    (renders "Get full issue details.", a legacy-only substring).
  - TestSlimFlagOnUsesSlim: flag on → slim path (renders "full
    issue.", a slim-only one-liner) AND must NOT render legacy
    "Get full issue details.".
  - TestBuildMetaSkillContentSlimKindMatrix: locks the per-kind
    section set; heading match is line-anchored so inline references
    don't trip absence assertions.
  - TestSlimQuickCreateAvailableCommands: locks the minimal-variant
    content for quick-create (issue create present, every other
    Core command absent).
  - TestSlimBriefIsSubstantiallyShorter: ≥ 30% reduction guard so
    a future change can't accidentally re-bloat the slim path back
    to legacy levels.
- cmd/server/main.go: now calls `execenv.SetFeatureFlags(flags)`
  immediately after constructing the feature flag service.
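
The line-anchored heading match mentioned for
TestBuildMetaSkillContentSlimKindMatrix could look something like the
sketch below. `hasHeading` and the `## Workflow` heading are
hypothetical names chosen for illustration; the point is only that a
heading quoted inline in prose does not count as the section being
present.

```go
package main

import (
	"fmt"
	"strings"
)

// hasHeading reports whether doc contains heading as a whole line,
// ignoring trailing whitespace. Inline mentions do not match.
func hasHeading(doc, heading string) bool {
	for _, line := range strings.Split(doc, "\n") {
		if strings.TrimRight(line, " \t\r") == heading {
			return true
		}
	}
	return false
}

func main() {
	doc := "## Workflow\nSee `## Comment Formatting` for shell hazards.\n"

	fmt.Println(hasHeading(doc, "## Workflow"))           // real heading line
	fmt.Println(hasHeading(doc, "## Comment Formatting")) // only referenced inline
}
```

A plain `strings.Contains` check would report the second heading as
present, which is the false positive an absence assertion needs to
avoid.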

Measured impact (slim vs legacy, claude provider, realistic fixture
with 2 repos + 2 skills + member initiator):

    legacy = 19567 chars
    slim   = 11868 chars    Δ = -7699  (-39.3%)
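
The arithmetic behind these figures, and the shape of a ≥ 30% guard,
can be reproduced with a short sketch. `reductionPercent` is a
hypothetical helper; the real TestSlimBriefIsSubstantiallyShorter may
compute the ratio differently.

```go
package main

import "fmt"

// reductionPercent returns how much shorter slim is than legacy, in percent.
func reductionPercent(legacy, slim int) float64 {
	return float64(legacy-slim) / float64(legacy) * 100
}

func main() {
	const legacy, slim = 19567, 11868 // measured sizes quoted above

	pct := reductionPercent(legacy, slim)
	fmt.Printf("saved %d chars (%.1f%%)\n", legacy-slim, pct) // saved 7699 chars (39.3%)

	// Guard in the spirit of the test: fail if the slim path re-bloats.
	if pct < 30 {
		fmt.Println("FAIL: slim brief is less than 30% shorter than legacy")
	}
}
```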

Verification:

- go vet ./internal/daemon/... ./cmd/server/...                  ok
- go test ./internal/daemon/...                                  ok
- go test ./pkg/featureflag/...                                  ok
- TestSlimBriefIsSubstantiallyShorter logs the 39.3% ratio
- TestSlimFlagOffUsesLegacy + TestSlimFlagOnUsesSlim pass, covering
  both directions of the toggle, so the dispatcher behaviour is
  locked in by tests.

The pre-existing `internal/handler` test failures
(TestLeaveWorkspace_RevokesOwnRuntimes,
TestDeleteMember_CancelsTasksFromAgentReassignment,
TestDeleteMember_NoRuntimes_DeletesMember) reproduce on plain
`origin/main` with the same `relation "channel_user_binding" does
not exist` SQL error — they are a missing-migration bug from the
recent channels foundation PR (ce28d0aa0), not anything this PR
touched.

Rollout plan:

1. Merge this PR. Production daemons keep emitting the legacy brief
   (flag default false).
2. Add a YAML rule to staging's
   `MULTICA_FEATURE_FLAGS_FILE`:

       runtime_brief_slim:
         default: true

   Staging daemons start emitting the slim brief on next restart.
3. Watch `agent prompt prepared` logs + agent behaviour for 7 days.
4. If staging is clean, flip the prod YAML to `default: true`.
   Legacy code path stays in the binary as a kill-switch
   (`FF_RUNTIME_BRIEF_SLIM=false` to revert without a deploy).
5. After ~30 days clean in prod, follow up with a PR that deletes
   the legacy body and the flag, the same pattern docs/feature-flags.md
   recommends ("plan the death of the flag at birth").

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 14:23:17 +08:00