Compare commits

..

1 Commits

Author SHA1 Message Date
Jiang Bohan
55cfaa1e1d fix(daemon/execenv): make posting result comment an explicit workflow step
Agents were silently finishing tasks without ever posting results to the
issue — their final reply stayed in terminal/log output only. See MUL-1124.

Root cause: the injected CLAUDE.md / AGENTS.md put "post a comment with
results" inside the body of step 4 (a nested clause in the default workflow
description), so skill-driven flows jumped straight from "do the work" to
`status in_review`.

- Hoist posting the result comment into its own explicit, numbered step in
  both assignment-triggered and comment-triggered workflows, with the exact
  `multica issue comment add` invocation inlined.
- Add a hard warning at the top of the Output section that terminal / chat
  text is never delivered to the user.
- Add regression test covering both workflow branches.
2026-04-20 16:51:41 +08:00
7 changed files with 79 additions and 239 deletions

View File

@@ -4,11 +4,6 @@ POSTGRES_USER=multica
POSTGRES_PASSWORD=multica
POSTGRES_PORT=5432
DATABASE_URL=postgres://multica:multica@localhost:5432/multica?sslmode=disable
# Optional pgxpool tuning. Defaults are 25 / 5 per pod and are usually fine.
# You can also set pool_max_conns / pool_min_conns as query params on
# DATABASE_URL; env vars below take precedence over URL params.
# DATABASE_MAX_CONNS=25
# DATABASE_MIN_CONNS=5
# Server
# APP_ENV gates dev-only auth shortcuts (primarily the 888888 master code).

View File

@@ -14,15 +14,6 @@ All configuration is done via environment variables. Copy `.env.example` as a st
| `JWT_SECRET` | **Must change from default.** Secret key for signing JWT tokens. Use a long random string. | `openssl rand -hex 32` |
| `FRONTEND_ORIGIN` | URL where the frontend is served (used for CORS) | `https://app.example.com` |
### Database Pool Tuning (Optional)
These have sensible defaults and only need to be set when tuning a large or constrained deployment. Precedence (highest first): env var → `pool_*` query params on `DATABASE_URL` → built-in default.
| Variable | Description | Default |
|----------|-------------|---------|
| `DATABASE_MAX_CONNS` | pgxpool max connections per pod. `pod_count × DATABASE_MAX_CONNS` should stay well below the Postgres `max_connections` ceiling. With a connection pooler (PgBouncer / RDS Proxy / Supavisor) in front, this can be raised significantly. | `25` |
| `DATABASE_MIN_CONNS` | pgxpool warm baseline connections per pod. Auto-clamped to `DATABASE_MAX_CONNS`. | `5` |
### Email (Required for Authentication)
Multica uses email-based magic link authentication via [Resend](https://resend.com).

View File

@@ -2,11 +2,7 @@ package main
import (
"context"
"fmt"
"log/slog"
"net/url"
"os"
"strconv"
"time"
"github.com/jackc/pgx/v5/pgxpool"
@@ -17,104 +13,8 @@ const (
// 15s lines up with the daemon heartbeat cadence so it's easy to
// correlate with traffic patterns in the prod logs.
dbStatsInterval = 15 * time.Second
// defaultMaxConns / defaultMinConns are the per-pod pgxpool sizing
// defaults. They replace pgx's built-in default of max(4, NumCPU),
// which is far too small for our daemon-poll traffic pattern (~3800
// acquires/s observed in prod) and was the root cause of the 3s+
// /tasks/claim tail latency.
//
// The numbers follow the conventional "small pool, lots of waiters"
// guidance for Postgres (HikariCP / PG community formula
// `(core_count * 2) + effective_spindle_count`): 25 leaves headroom
// for bursts and the occasional long-running query while staying well
// below typical managed-Postgres `max_connections` ceilings when
// multiplied across pods. MinConns=5 keeps a warm baseline so cold
// pods don't pay handshake cost on first traffic.
//
// Both values are overridable via DATABASE_MAX_CONNS / DATABASE_MIN_CONNS.
defaultMaxConns int32 = 25
defaultMinConns int32 = 5
)
// newDBPool builds a pgxpool with sane production defaults and env overrides.
//
// pgxpool.New(ctx, url) — used previously — silently picks MaxConns =
// max(4, NumCPU). On our prod pods (small CPU request) that resolved to 4,
// which got fully saturated by the daemon claim/heartbeat traffic and showed
// up as ~900ms acquire waits on every query.
//
// Configuration precedence (highest first):
// 1. DATABASE_MAX_CONNS / DATABASE_MIN_CONNS env vars
// 2. pool_max_conns / pool_min_conns query params on DATABASE_URL
// (honored natively by pgxpool.ParseConfig)
// 3. The defaults defined here (defaultMaxConns / defaultMinConns)
//
// pgx's own built-in default (max(4, NumCPU)) is intentionally NOT used as a
// fallback — it is the value that caused the prod incident.
func newDBPool(ctx context.Context, dbURL string) (*pgxpool.Pool, error) {
cfg, err := pgxpool.ParseConfig(dbURL)
if err != nil {
return nil, fmt.Errorf("parse database url: %w", err)
}
urlParams := poolParamsFromURL(dbURL)
// Compute the non-env fallback first: honor URL pool_* params if the
// operator set them, otherwise use our code default. This fallback is
// also what an *invalid* env value falls back to — never pgx's built-in
// default of 4/0, which is the value that caused the prod incident.
maxFallback := defaultMaxConns
if urlParams["pool_max_conns"] {
maxFallback = cfg.MaxConns
}
cfg.MaxConns = envInt32("DATABASE_MAX_CONNS", maxFallback)
minFallback := defaultMinConns
if urlParams["pool_min_conns"] {
minFallback = cfg.MinConns
}
cfg.MinConns = envInt32("DATABASE_MIN_CONNS", minFallback)
if cfg.MinConns > cfg.MaxConns {
cfg.MinConns = cfg.MaxConns
}
return pgxpool.NewWithConfig(ctx, cfg)
}
// poolParamsFromURL returns the set of pool_* query params present on the
// database URL. Used to detect whether the operator already tuned the pool
// via the connection string, so env-less upgrades don't silently override
// existing configuration.
func poolParamsFromURL(dbURL string) map[string]bool {
out := map[string]bool{}
u, err := url.Parse(dbURL)
if err != nil {
return out
}
for k := range u.Query() {
out[k] = true
}
return out
}
// envInt32 reads an int32 from the named env var. Empty / invalid values fall
// back to def and emit a warn so misconfiguration is visible in startup logs.
func envInt32(name string, def int32) int32 {
raw := os.Getenv(name)
if raw == "" {
return def
}
v, err := strconv.ParseInt(raw, 10, 32)
if err != nil || v <= 0 {
slog.Warn("invalid env var, using default",
"name", name, "value", raw, "default", def, "error", err)
return def
}
return int32(v)
}
// logPoolConfig prints the effective pgxpool configuration once at startup.
// Surfacing this is critical because pgxpool defaults are surprisingly small
// (MaxConns = max(4, NumCPU)) — without seeing the value in the log it's
@@ -141,10 +41,10 @@ func runDBStatsLogger(ctx context.Context, pool *pgxpool.Pool) {
defer ticker.Stop()
var (
lastEmpty int64
lastAcquire int64
lastAcquireDur time.Duration
lastCanceled int64
lastEmpty int64
lastAcquire int64
lastAcquireDur time.Duration
lastCanceled int64
)
for {
select {

View File

@@ -1,109 +0,0 @@
package main
import (
"testing"
"github.com/jackc/pgx/v5/pgxpool"
)
// applyPoolSizing mirrors the env+URL precedence logic in newDBPool but
// without actually opening a connection, so the resolution rules can be
// asserted in unit tests.
func applyPoolSizing(t *testing.T, dbURL string, envMax, envMin string) (max, min int32) {
t.Helper()
cfg, err := pgxpool.ParseConfig(dbURL)
if err != nil {
t.Fatalf("ParseConfig: %v", err)
}
urlParams := poolParamsFromURL(dbURL)
maxFallback := defaultMaxConns
if urlParams["pool_max_conns"] {
maxFallback = cfg.MaxConns
}
if envMax != "" {
t.Setenv("DATABASE_MAX_CONNS", envMax)
}
cfg.MaxConns = envInt32("DATABASE_MAX_CONNS", maxFallback)
minFallback := defaultMinConns
if urlParams["pool_min_conns"] {
minFallback = cfg.MinConns
}
if envMin != "" {
t.Setenv("DATABASE_MIN_CONNS", envMin)
}
cfg.MinConns = envInt32("DATABASE_MIN_CONNS", minFallback)
if cfg.MinConns > cfg.MaxConns {
cfg.MinConns = cfg.MaxConns
}
return cfg.MaxConns, cfg.MinConns
}
func TestPoolSizing_DefaultsWhenNothingSet(t *testing.T) {
max, min := applyPoolSizing(t, "postgres://u:p@h/db?sslmode=disable", "", "")
if max != defaultMaxConns || min != defaultMinConns {
t.Fatalf("got max=%d min=%d, want %d/%d", max, min, defaultMaxConns, defaultMinConns)
}
}
func TestPoolSizing_URLParamsHonoredWhenEnvUnset(t *testing.T) {
url := "postgres://u:p@h/db?sslmode=disable&pool_max_conns=40&pool_min_conns=8"
max, min := applyPoolSizing(t, url, "", "")
if max != 40 || min != 8 {
t.Fatalf("URL params should win when env unset; got max=%d min=%d", max, min)
}
}
func TestPoolSizing_EnvOverridesURL(t *testing.T) {
url := "postgres://u:p@h/db?sslmode=disable&pool_max_conns=40&pool_min_conns=8"
max, min := applyPoolSizing(t, url, "100", "20")
if max != 100 || min != 20 {
t.Fatalf("env should win over URL; got max=%d min=%d", max, min)
}
}
func TestPoolSizing_PartialURLParam(t *testing.T) {
// Only pool_max_conns is set in URL — pool_min_conns should fall back to
// the code default, not pgx's built-in default (which would be 0).
url := "postgres://u:p@h/db?sslmode=disable&pool_max_conns=40"
max, min := applyPoolSizing(t, url, "", "")
if max != 40 {
t.Fatalf("URL pool_max_conns should be honored; got max=%d", max)
}
if min != defaultMinConns {
t.Fatalf("min should default; got min=%d, want %d", min, defaultMinConns)
}
}
func TestPoolSizing_InvalidEnvFallsBackToCodeDefault(t *testing.T) {
// Invalid env value with no URL pool param → code default, NOT pgx's
// built-in 4. This is the regression that was fixed; pinning it here
// so we don't silently fall back to the bad value again.
max, min := applyPoolSizing(t, "postgres://u:p@h/db?sslmode=disable", "not-a-number", "")
if max != defaultMaxConns {
t.Fatalf("invalid env should fall back to code default; got max=%d, want %d", max, defaultMaxConns)
}
if min != defaultMinConns {
t.Fatalf("got min=%d, want %d", min, defaultMinConns)
}
}
func TestPoolSizing_InvalidEnvFallsBackToURLParam(t *testing.T) {
// Invalid env value with a URL pool param → URL param wins, NOT pgx
// default. This is what makes the precedence chain end at "URL or code
// default" rather than at "pgx default" on misconfiguration.
url := "postgres://u:p@h/db?sslmode=disable&pool_max_conns=40"
max, _ := applyPoolSizing(t, url, "not-a-number", "")
if max != 40 {
t.Fatalf("invalid env should fall back to URL param; got max=%d, want 40", max)
}
}
func TestPoolSizing_MinClampedToMax(t *testing.T) {
max, min := applyPoolSizing(t, "postgres://u:p@h/db?sslmode=disable", "10", "50")
if min > max {
t.Fatalf("min should be clamped to max; got max=%d min=%d", max, min)
}
}

View File

@@ -9,6 +9,8 @@ import (
"syscall"
"time"
"github.com/jackc/pgx/v5/pgxpool"
"github.com/multica-ai/multica/server/internal/events"
"github.com/multica-ai/multica/server/internal/logger"
"github.com/multica-ai/multica/server/internal/realtime"
@@ -39,7 +41,7 @@ func main() {
// Connect to database
ctx := context.Background()
pool, err := newDBPool(ctx, dbURL)
pool, err := pgxpool.New(ctx, dbURL)
if err != nil {
slog.Error("unable to connect to database", "error", err)
os.Exit(1)

View File

@@ -671,6 +671,66 @@ func TestPrepareWithRepoContextOpencode(t *testing.T) {
}
}
// TestInjectRuntimeConfigRequiresExplicitCommentPost ensures the injected
// workflow makes "post a comment with results" an explicit, unmissable step in
// both the assignment- and comment-triggered branches, plus hard-warns in the
// Output section that terminal/log text is not user-visible. Agents were
// silently finishing tasks without ever posting their result to the issue; see
// MUL-1124. Covering this in a test prevents the guidance from decaying back
// into a nested clause again.
func TestInjectRuntimeConfigRequiresExplicitCommentPost(t *testing.T) {
t.Parallel()
assignmentCtx := TaskContextForEnv{IssueID: "issue-1"}
commentCtx := TaskContextForEnv{IssueID: "issue-1", TriggerCommentID: "comment-1"}
for _, tc := range []struct {
name string
ctx TaskContextForEnv
}{
{"assignment-triggered", assignmentCtx},
{"comment-triggered", commentCtx},
} {
tc := tc
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
dir := t.TempDir()
if err := InjectRuntimeConfig(dir, "claude", tc.ctx); err != nil {
t.Fatalf("InjectRuntimeConfig failed: %v", err)
}
data, err := os.ReadFile(filepath.Join(dir, "CLAUDE.md"))
if err != nil {
t.Fatalf("read CLAUDE.md: %v", err)
}
s := string(data)
// The workflow must contain an explicit `multica issue comment add`
// invocation for this issue — not just a prose mention of posting.
mustContain := []string{
"multica issue comment add issue-1",
"mandatory",
}
for _, want := range mustContain {
if !strings.Contains(s, want) {
t.Errorf("%s: CLAUDE.md missing %q\n---\n%s", tc.name, want, s)
}
}
// The Output section must carry a hard warning that terminal/log
// output is not user-visible. This is the second line of defense
// in case the agent skips past the workflow steps.
for _, want := range []string{
"Final results MUST be delivered via `multica issue comment add`",
"does NOT see your terminal output",
} {
if !strings.Contains(s, want) {
t.Errorf("%s: Output warning missing %q", tc.name, want)
}
}
})
}
}
func TestInjectRuntimeConfigUnknownProvider(t *testing.T) {
t.Parallel()
dir := t.TempDir()
@@ -1056,11 +1116,11 @@ network_access = true
func TestCodexSandboxPolicyFor(t *testing.T) {
t.Parallel()
cases := []struct {
name string
goos string
version string
wantMode string
wantNet bool
name string
goos string
version string
wantMode string
wantNet bool
}{
{"linux any version", "linux", "0.100.0", "workspace-write", true},
{"linux unknown version", "linux", "", "workspace-write", true},

View File

@@ -129,9 +129,9 @@ func buildMetaSkillContent(provider string, ctx TaskContextForEnv) string {
fmt.Fprintf(&b, "2. Run `multica issue comment list %s --output json` to read the conversation\n", ctx.IssueID)
b.WriteString(" - If the output is very large or truncated, use pagination: `--limit 30` to get the latest 30 comments, or `--since <timestamp>` to fetch only recent ones\n")
fmt.Fprintf(&b, "3. Find the triggering comment (ID: `%s`) and understand what is being asked — do NOT confuse it with previous comments\n", ctx.TriggerCommentID)
b.WriteString("4. ")
b.WriteString("4. If the comment requests code changes or further work, do the work first\n")
b.WriteString("5. **Post your reply as a comment — this step is mandatory.** Text in your terminal or run logs is NOT delivered to the user. ")
b.WriteString(BuildCommentReplyInstructions(ctx.IssueID, ctx.TriggerCommentID))
b.WriteString("5. If the comment requests code changes or further work, do the work first, then reply with your results\n")
b.WriteString("6. Do NOT change the issue status unless the comment explicitly asks for it\n\n")
} else {
// Assignment-triggered: defer to agent Skills for workflow specifics.
@@ -139,10 +139,10 @@ func buildMetaSkillContent(provider string, ctx TaskContextForEnv) string {
fmt.Fprintf(&b, "1. Run `multica issue get %s --output json` to understand your task\n", ctx.IssueID)
fmt.Fprintf(&b, "2. Run `multica issue status %s in_progress`\n", ctx.IssueID)
b.WriteString("3. Read comments for additional context or human instructions\n")
b.WriteString("4. Follow your Skills and Agent Identity to determine how to complete this task.\n")
b.WriteString(" If no relevant skill applies, the default workflow is: understand the task → do the work → post a comment with results → update issue status.\n")
fmt.Fprintf(&b, "5. When done, run `multica issue status %s in_review`\n", ctx.IssueID)
fmt.Fprintf(&b, "6. If blocked, run `multica issue status %s blocked` and post a comment explaining why\n\n", ctx.IssueID)
b.WriteString("4. Follow your Skills and Agent Identity to complete the task (write code, investigate, etc.)\n")
fmt.Fprintf(&b, "5. **Post your final results as a comment — this step is mandatory**: `multica issue comment add %s --content \"...\"`. Your results are only visible to the user if posted via this CLI call; text in your terminal or run logs is NOT delivered.\n", ctx.IssueID)
fmt.Fprintf(&b, "6. When done, run `multica issue status %s in_review`\n", ctx.IssueID)
fmt.Fprintf(&b, "7. If blocked, run `multica issue status %s blocked` and post a comment explaining why\n\n", ctx.IssueID)
}
if len(ctx.AgentSkills) > 0 {
@@ -190,6 +190,7 @@ func buildMetaSkillContent(provider string, ctx TaskContextForEnv) string {
b.WriteString("do NOT attempt to work around it. Instead, post a comment mentioning the workspace owner to request the missing functionality.\n\n")
b.WriteString("## Output\n\n")
b.WriteString("⚠️ **Final results MUST be delivered via `multica issue comment add`.** The user does NOT see your terminal output, assistant chat text, or run logs — only comments on the issue. A task that finishes without a result comment is invisible to the user, even if the work itself was correct.\n\n")
b.WriteString("Keep comments concise and natural — state the outcome, not the process.\n")
b.WriteString("Good: \"Fixed the login redirect. PR: https://...\"\n")
b.WriteString("Bad: \"1. Read the issue 2. Found the bug in auth.go 3. Created branch 4. ...\"\n")