mirror of
https://github.com/multica-ai/multica.git
synced 2026-06-17 11:48:42 +02:00
* feat(agent): persist thinking_level per agent (MUL-2339) Adds a nullable `thinking_level` column to the `agent` table so the backend can route a runtime-native reasoning/effort token (e.g. Claude's `xhigh`, Codex's `minimal`) through to the agent CLI on every dispatch. The column is intentionally TEXT rather than an enum — Claude and Codex publish overlapping but distinct vocabularies and we want the persisted value to round-trip exactly through whichever CLI receives it. NULL is the "use runtime default" sentinel that every downstream consumer reads as "do not inject --effort / reasoning_effort". This commit is just the storage layer (migration + sqlc); subsequent commits wire it through the API, daemon, and agent backends. Co-authored-by: multica-agent <github@multica.ai> * feat(agent-backend): inject reasoning effort for claude + codex (MUL-2339) Extends ExecOptions with a runtime-native ThinkingLevel string and wires it into the Claude and Codex backends. Discovery is driven by the local CLI so the daemon advertises whatever the host install supports rather than a hand-maintained list that goes stale. Per Elon's PR1 review: - Claude: parses `claude --help` to learn the `--effort` superset and projects through a per-model allow-list (xhigh is Opus-only; max is session-only on the smaller models). Falls back to a conservative static list when the binary is missing or help drift hides the line. - Codex: drives `codex debug models --output json` so per-model reasoning subsets and the documented default come directly from the CLI. The older config-error probe trick is gone — the JSON path is stable and doesn't pollute stderr with an intentional misconfig. - Cache key includes (provider, executablePath, cliVersion) so a CLI upgrade invalidates entries that referenced the older help / catalog. Per Trump's PR1 constraint, all three Codex injection points (thread/start.config, thread/resume.config, turn/start.effort) flow through one helper (`applyCodexReasoningEffort`) so they cannot drift independently. The shared `codexReasoningCases` fixture in `thinking_test.go` asserts the same value→{shape, key} contract at each site for every level the runtimes know about. Claude's `--effort` is also added to `claudeBlockedArgs` so a user custom_args entry can't silently outvote the daemon-injected value. Co-authored-by: multica-agent <github@multica.ai> * feat(api): wire thinking_level through API + daemon contract (MUL-2339) End-to-end plumbing for the per-agent reasoning/effort setting: - AgentResponse / TaskAgentData now carry `thinking_level`; the daemon's claim response includes it and the daemon's executor passes it through to agent.ExecOptions, where the Claude and Codex backends already know what to do with it. - ModelEntry on the runtime-models wire format gains a `thinking` block carrying `supported_levels` + `default_level` per model so the UI can render a runtime-aware picker without the server having to know about the local CLI install. `handleModelList` projects the agent-package catalog (including the new Thinking field) into the wire shape. - CreateAgent / UpdateAgent gate the field with a synchronous provider enum check (claude / codex only today). UpdateAgent is tri-state: field omitted = no change, "" = explicit clear (new `ClearAgentThinkingLevel` query, mirrors the existing mcp_config null pattern), non-empty = validate then set. Per Trump's PR1 review, the API NEVER auto-clears on a runtime/model swap and ALWAYS returns 400 on an unknown literal value — same shape across CreateAgent, UpdateAgent, and combined patches that move runtime + level in one request. Per-model combination failures (e.g. `xhigh` against a model that only supports up to `high`) surface as a daemon-side task error, not a silent server-side rewrite. TS types follow the same shape: `Agent.thinking_level`, `CreateAgentRequest`/`UpdateAgentRequest` add the field, `RuntimeModel` grows a `thinking` block. Older backends omit the field, which the front-end treats as "no picker for this model" — installed desktop builds keep working. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): correct codex debug models argv + pin via runner test (MUL-2339) `codex debug models --output json` is rejected by codex-cli 0.131.0 — the subcommand emits JSON on stdout by default and has no `--output` flag. Drop the flag and add `--bundled` to skip the network refresh discovery doesn't need. Move the argv to a package-level var and add a test that runs a fake `codex` to assert the binary actually receives exactly `debug models --bundled`, so the contract can't silently drift on the next refactor. Also teach ValidateThinkingLevel to resolve an empty model to the provider's default model entry. Without this, every default-model task with a persisted thinking_level would be misjudged "unknown model" by the daemon guard. Co-authored-by: multica-agent <github@multica.ai> * fix(api): reject runtime switch that would leave invalid thinking_level (MUL-2339) A PATCH that changed `runtime_id` without touching `thinking_level` used to silently keep the existing value, so a Claude agent storing `max` could land on a Codex runtime where `max` is not a recognised token at all, and the daemon would receive a literal-invalid level. Hold the same "always 400 on literal-invalid, never silent coerce" rule on this implicit path. When runtime_id changes and the existing value is not in the new provider's enum, return 400 with the recovery options (clear via `thinking_level=""` or re-set in the same PATCH). Add coverage for both the kept-when-still-valid and the rejected cases, plus the two recovery paths (clear and replace). Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): guard runTask with per-model thinking_level validator (MUL-2339) ValidateThinkingLevel existed but had no call site — `task.Agent. ThinkingLevel` flowed straight into ExecOptions, so `xhigh` configured on a non-Opus Claude model, or API-side stale values that escaped the provider enum gate, would be injected anyway. Run the validator before building ExecOptions. Invalid combinations log a warning and drop the level instead of failing the task: the agent still runs, just at the runtime's default reasoning effort. Discovery errors fail open (keep the level, let the CLI surface any objection) so a transient `claude --help` failure can't strand work. Empty model is forwarded as-is; the validator resolves it to the provider's default model internally per the cross-package contract. Co-authored-by: multica-agent <github@multica.ai> * chore(agent): drop stale `--output json` comments + unused scanner (MUL-2339) Codex CLI's `debug models` subcommand emits JSON without an `--output` flag, and `parseCodexDebugModels` never read from the bufio.Scanner. Sync the comments with the actual invocation and remove the dead init. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>
468 lines
17 KiB
Go
468 lines
17 KiB
Go
package agent
|
|
|
|
import (
|
|
"context"
|
|
"encoding/json"
|
|
"os/exec"
|
|
"regexp"
|
|
"strings"
|
|
"sync"
|
|
"time"
|
|
)
|
|
|
|
// thinking.go discovers per-model reasoning/effort catalogs for the
|
|
// claude and codex backends so the daemon can advertise them to the
|
|
// UI without hard-coding (and getting wrong) what's installed locally.
|
|
//
|
|
// MUL-2339: we deliberately do not flatten Claude's `low|medium|high|
|
|
// xhigh|max` and Codex's `none|minimal|low|medium|high|xhigh` onto a
|
|
// shared enum — what users pick must round-trip exactly through each
|
|
// CLI's own value vocabulary.
|
|
|
|
// ── Cache ────────────────────────────────────────────────────────────
|
|
//
|
|
// Discovery is keyed on (provider, executablePath, cliVersion). Bumping
|
|
// the local CLI invalidates entries that referenced the older version's
|
|
// help/`debug models` output, which is exactly the failure mode we hit
|
|
// when Anthropic / OpenAI add or remove a level (Elon's review note).
|
|
|
|
type thinkingCacheKey struct {
|
|
provider string
|
|
executablePath string
|
|
cliVersion string
|
|
}
|
|
|
|
type thinkingCacheEntry struct {
|
|
value map[string]*ModelThinking // keyed by model ID
|
|
expiresAt time.Time
|
|
}
|
|
|
|
const thinkingDiscoveryTTL = 10 * time.Minute
|
|
|
|
var (
|
|
thinkingCacheMu sync.Mutex
|
|
thinkingCache = map[thinkingCacheKey]thinkingCacheEntry{}
|
|
)
|
|
|
|
func thinkingCacheGet(key thinkingCacheKey) (map[string]*ModelThinking, bool) {
|
|
thinkingCacheMu.Lock()
|
|
defer thinkingCacheMu.Unlock()
|
|
entry, ok := thinkingCache[key]
|
|
if !ok || time.Now().After(entry.expiresAt) {
|
|
return nil, false
|
|
}
|
|
return entry.value, true
|
|
}
|
|
|
|
func thinkingCachePut(key thinkingCacheKey, value map[string]*ModelThinking) {
|
|
thinkingCacheMu.Lock()
|
|
defer thinkingCacheMu.Unlock()
|
|
thinkingCache[key] = thinkingCacheEntry{value: value, expiresAt: time.Now().Add(thinkingDiscoveryTTL)}
|
|
}
|
|
|
|
// resetThinkingCacheForTests is exposed for tests only; production code
|
|
// must rely on the TTL or process restart for invalidation.
|
|
func resetThinkingCacheForTests() {
|
|
thinkingCacheMu.Lock()
|
|
thinkingCache = map[thinkingCacheKey]thinkingCacheEntry{}
|
|
thinkingCacheMu.Unlock()
|
|
}
|
|
|
|
// ── Claude ───────────────────────────────────────────────────────────
|
|
//
|
|
// `claude --help` advertises `--effort <level>` with the full superset
|
|
// in parentheses; we parse that line to learn which levels the CLI
|
|
// version on this host accepts. Per-model gaps (Opus-only `xhigh`,
|
|
// session-only `max`) come from a hand-maintained table because the
|
|
// CLI does not expose model→effort mappings programmatically.
|
|
|
|
// claudeEffortRe matches the help line emitted by `claude --help`:
|
|
//
|
|
// --effort <level> Effort level for the current session (low, medium, high, xhigh, max)
|
|
//
|
|
// Anchored on `--effort` and lenient about whitespace so flag-name
|
|
// reformats (`--effort=…`, indented help blocks) do not break parsing.
|
|
var claudeEffortRe = regexp.MustCompile(`--effort\s*(?:<[^>]+>)?\s*(?:Effort level[^(]*)?\(([^)]+)\)`)
|
|
|
|
// claudeEffortLabel maps Claude's raw level token to the display label
|
|
// the UI should render. Title-case matches Anthropic's own slash UI.
|
|
var claudeEffortLabel = map[string]string{
|
|
"low": "Low",
|
|
"medium": "Medium",
|
|
"high": "High",
|
|
"xhigh": "Extra high",
|
|
"max": "Max",
|
|
}
|
|
|
|
// claudeModelEffortAllow restricts the level set per model where the
|
|
// upstream documentation says only some are valid. Empty / missing
|
|
// model → use the parsed superset as-is (current Claude Code default).
|
|
// Update this map when Anthropic publishes a new model that does not
|
|
// support `xhigh` / `max`.
|
|
var claudeModelEffortAllow = map[string]map[string]bool{
|
|
// Opus is the only model that publicly supports xhigh; the help
|
|
// list still includes it for Sonnet / Haiku so we filter here.
|
|
"claude-opus-4-7": {"low": true, "medium": true, "high": true, "xhigh": true, "max": true},
|
|
"claude-opus-4-6": {"low": true, "medium": true, "high": true, "xhigh": true, "max": true},
|
|
"claude-sonnet-4-6": {"low": true, "medium": true, "high": true, "max": true},
|
|
"claude-sonnet-4-5": {"low": true, "medium": true, "high": true, "max": true},
|
|
"claude-haiku-4-5-20251001": {"low": true, "medium": true, "high": true},
|
|
}
|
|
|
|
// claudeStaticEffortFallback is the conservative subset used when
|
|
// parsing the `--effort` help line fails (binary missing, output drift,
|
|
// etc.). Picked from the lowest-common-denominator across recent
|
|
// Claude Code releases.
|
|
var claudeStaticEffortFallback = []string{"low", "medium", "high"}
|
|
|
|
// claudeStaticEffortFullSuperset is what `claude --help` listed on
|
|
// 2.1.121. Used as the catalog superset when a model isn't in the
|
|
// per-model allow-list — we'd rather over-offer and let the CLI
|
|
// reject than artificially block valid combinations.
|
|
var claudeStaticEffortFullSuperset = []string{"low", "medium", "high", "xhigh", "max"}
|
|
|
|
// annotateClaudeThinking populates each entry's Thinking field by
|
|
// running `claude --help` once and projecting the parsed superset
|
|
// through claudeModelEffortAllow. Errors are silently absorbed so a
|
|
// missing CLI doesn't break model listing — the UI just hides the
|
|
// picker for that model.
|
|
func annotateClaudeThinking(ctx context.Context, models []Model, executablePath string) {
|
|
mapping := loadClaudeThinkingByModel(ctx, executablePath)
|
|
for i := range models {
|
|
if t, ok := mapping[models[i].ID]; ok && t != nil {
|
|
models[i].Thinking = t
|
|
}
|
|
}
|
|
}
|
|
|
|
func loadClaudeThinkingByModel(ctx context.Context, executablePath string) map[string]*ModelThinking {
|
|
if executablePath == "" {
|
|
executablePath = "claude"
|
|
}
|
|
version, _ := DetectVersion(ctx, executablePath)
|
|
key := thinkingCacheKey{provider: "claude", executablePath: executablePath, cliVersion: version}
|
|
if cached, ok := thinkingCacheGet(key); ok {
|
|
return cached
|
|
}
|
|
|
|
superset := claudeEffortSuperset(ctx, executablePath)
|
|
result := map[string]*ModelThinking{}
|
|
for _, m := range claudeStaticModels() {
|
|
allow := claudeModelEffortAllow[m.ID]
|
|
levels := projectClaudeLevels(superset, allow)
|
|
if len(levels) == 0 {
|
|
continue
|
|
}
|
|
result[m.ID] = &ModelThinking{
|
|
SupportedLevels: levels,
|
|
DefaultLevel: "medium",
|
|
}
|
|
}
|
|
thinkingCachePut(key, result)
|
|
return result
|
|
}
|
|
|
|
// claudeEffortSuperset returns the parsed `--effort` value list. When
|
|
// parsing fails it returns the static fallback rather than nothing so
|
|
// callers can still render a usable picker.
|
|
func claudeEffortSuperset(ctx context.Context, executablePath string) []string {
|
|
cmd := exec.CommandContext(ctx, executablePath, "--help")
|
|
hideAgentWindow(cmd)
|
|
out, err := cmd.CombinedOutput()
|
|
if err != nil {
|
|
return append([]string(nil), claudeStaticEffortFallback...)
|
|
}
|
|
parsed := parseClaudeEffortHelp(string(out))
|
|
if len(parsed) == 0 {
|
|
// Help format drifted — fall back to the last known good
|
|
// superset rather than the conservative subset, so newer
|
|
// levels are still offered until we hand-edit the fallback.
|
|
return append([]string(nil), claudeStaticEffortFullSuperset...)
|
|
}
|
|
return parsed
|
|
}
|
|
|
|
// parseClaudeEffortHelp extracts the comma-separated value list from a
|
|
// `--effort` help line. Returns nil if the line is missing or the
|
|
// captured group is empty so callers can pick a fallback path.
|
|
func parseClaudeEffortHelp(helpText string) []string {
|
|
match := claudeEffortRe.FindStringSubmatch(helpText)
|
|
if len(match) < 2 {
|
|
return nil
|
|
}
|
|
var out []string
|
|
for _, raw := range strings.Split(match[1], ",") {
|
|
token := strings.TrimSpace(raw)
|
|
if token == "" {
|
|
continue
|
|
}
|
|
out = append(out, token)
|
|
}
|
|
return out
|
|
}
|
|
|
|
func projectClaudeLevels(superset []string, allow map[string]bool) []ThinkingLevel {
|
|
out := make([]ThinkingLevel, 0, len(superset))
|
|
for _, value := range superset {
|
|
if allow != nil && !allow[value] {
|
|
continue
|
|
}
|
|
label, ok := claudeEffortLabel[value]
|
|
if !ok {
|
|
// New value the daemon hasn't been taught yet — surface
|
|
// it raw so power users can still pick it.
|
|
label = strings.Title(value) //nolint:staticcheck
|
|
}
|
|
out = append(out, ThinkingLevel{Value: value, Label: label})
|
|
}
|
|
return out
|
|
}
|
|
|
|
// ── Codex ────────────────────────────────────────────────────────────
|
|
//
|
|
// `codex debug models` is the structured discovery hook Elon's review
|
|
// flagged. It returns the per-model reasoning catalog directly,
|
|
// including the model's documented default. We prefer this over the
|
|
// older config-error probe trick because:
|
|
// 1. It gives us per-model subsets without hand-maintained tables.
|
|
// 2. The schema is stable across CLI versions (Codex 0.131.0+).
|
|
// 3. It doesn't pollute stderr with an intentional misconfiguration.
|
|
//
|
|
// The subcommand emits JSON on stdout by default — there is no
|
|
// `--output json` flag (a prior version of this code passed one and
|
|
// silently failed on 0.131.0). We add `--bundled` to skip the network
|
|
// refresh: discovery runs on every daemon poll and a network hop here
|
|
// would block the picker behind whatever the user's connection allows.
|
|
// The bundled catalog is what determines which `model_reasoning_effort`
|
|
// tokens the local binary actually accepts, which is the only thing we
|
|
// need for validation.
|
|
//
|
|
// On older Codex versions / failures, the picker just disappears for
|
|
// that model rather than offering a wrong list.
|
|
|
|
// codexEffortLabel is the human display string for each Codex effort
|
|
// value, matching Codex's own TUI (`Extra high`, `Minimal`, …) so
|
|
// users see the same labels across our picker and `codex /model`.
|
|
var codexEffortLabel = map[string]string{
|
|
"none": "None",
|
|
"minimal": "Minimal",
|
|
"low": "Low",
|
|
"medium": "Medium",
|
|
"high": "High",
|
|
"xhigh": "Extra high",
|
|
}
|
|
|
|
// codexDebugModelsResponse mirrors the JSON shape emitted by
|
|
// `codex debug models` (Codex 0.131.0+). Only the fields we
|
|
// consume are typed; unknown keys are ignored.
|
|
type codexDebugModelsResponse struct {
|
|
Models []struct {
|
|
Slug string `json:"slug"`
|
|
DefaultReasoningLevel string `json:"default_reasoning_level"`
|
|
SupportedReasoningLevel []struct {
|
|
Effort string `json:"effort"`
|
|
Description string `json:"description"`
|
|
} `json:"supported_reasoning_levels"`
|
|
} `json:"models"`
|
|
}
|
|
|
|
// annotateCodexThinking decorates each model entry with its reasoning
|
|
// catalog. Models the CLI doesn't know about (older codex install,
|
|
// brand-new ID we haven't shipped) get Thinking=nil — the UI hides
|
|
// the picker for those rows rather than guessing.
|
|
func annotateCodexThinking(ctx context.Context, models []Model, executablePath string) {
|
|
mapping := loadCodexThinkingByModel(ctx, executablePath)
|
|
for i := range models {
|
|
if t, ok := mapping[models[i].ID]; ok && t != nil {
|
|
models[i].Thinking = t
|
|
}
|
|
}
|
|
}
|
|
|
|
func loadCodexThinkingByModel(ctx context.Context, executablePath string) map[string]*ModelThinking {
|
|
if executablePath == "" {
|
|
executablePath = "codex"
|
|
}
|
|
version, _ := DetectVersion(ctx, executablePath)
|
|
key := thinkingCacheKey{provider: "codex", executablePath: executablePath, cliVersion: version}
|
|
if cached, ok := thinkingCacheGet(key); ok {
|
|
return cached
|
|
}
|
|
|
|
raw, err := runCodexDebugModels(ctx, executablePath)
|
|
if err != nil {
|
|
// Cache the empty result so repeated UI polls don't re-shell
|
|
// the missing binary; TTL eventually retries.
|
|
thinkingCachePut(key, map[string]*ModelThinking{})
|
|
return map[string]*ModelThinking{}
|
|
}
|
|
parsed := parseCodexDebugModels(raw)
|
|
thinkingCachePut(key, parsed)
|
|
return parsed
|
|
}
|
|
|
|
// codexDebugModelsArgs is the argv we pass to discover the local Codex
|
|
// catalog. Kept as a package-level var (not a literal at the call site)
|
|
// so tests can assert the exact form a real `codex` invocation receives,
|
|
// not just the parser behavior on a fixture string. The argv shape is
|
|
// the contract that broke under PR1 review; the test that pins it sits
|
|
// in thinking_test.go.
|
|
var codexDebugModelsArgs = []string{"debug", "models", "--bundled"}
|
|
|
|
func runCodexDebugModels(ctx context.Context, executablePath string) ([]byte, error) {
|
|
cmd := exec.CommandContext(ctx, executablePath, codexDebugModelsArgs...)
|
|
hideAgentWindow(cmd)
|
|
return cmd.Output()
|
|
}
|
|
|
|
// parseCodexDebugModels takes the JSON payload from `codex debug
|
|
// models` and projects it into a per-model thinking catalog.
|
|
// Returns an empty map (never nil) so callers can compose safely
|
|
// without nil-checking the result.
|
|
func parseCodexDebugModels(raw []byte) map[string]*ModelThinking {
|
|
out := map[string]*ModelThinking{}
|
|
var resp codexDebugModelsResponse
|
|
if err := json.Unmarshal(raw, &resp); err != nil {
|
|
return out
|
|
}
|
|
for _, m := range resp.Models {
|
|
if m.Slug == "" || len(m.SupportedReasoningLevel) == 0 {
|
|
continue
|
|
}
|
|
levels := make([]ThinkingLevel, 0, len(m.SupportedReasoningLevel))
|
|
for _, lvl := range m.SupportedReasoningLevel {
|
|
if lvl.Effort == "" {
|
|
continue
|
|
}
|
|
label, ok := codexEffortLabel[lvl.Effort]
|
|
if !ok {
|
|
label = strings.Title(lvl.Effort) //nolint:staticcheck
|
|
}
|
|
levels = append(levels, ThinkingLevel{
|
|
Value: lvl.Effort,
|
|
Label: label,
|
|
Description: lvl.Description,
|
|
})
|
|
}
|
|
if len(levels) == 0 {
|
|
continue
|
|
}
|
|
out[m.Slug] = &ModelThinking{
|
|
SupportedLevels: levels,
|
|
DefaultLevel: m.DefaultReasoningLevel,
|
|
}
|
|
}
|
|
return out
|
|
}
|
|
|
|
// ── Shared validation ────────────────────────────────────────────────
|
|
|
|
// ValidateThinkingLevel reports whether `value` is in the supported
|
|
// catalog for the given (provider, model) pair. Empty value is always
|
|
// valid — it means "use the runtime default".
|
|
//
|
|
// Empty model is treated as "use the provider's default model"; we
|
|
// resolve it through ListModels so the daemon's pre-execution guard
|
|
// behaves the same whether the agent picked an explicit model or
|
|
// inherited the runtime default. Without this, a default-model task
|
|
// with a valid thinking_level would be rejected on the grounds that
|
|
// the empty string is not in the catalog — exactly the misjudgement
|
|
// Elon flagged in the PR1 review.
|
|
//
|
|
// The lookup goes through ListModels so it sees the *current* CLI
|
|
// catalog (including dynamic discovery for codex), not just a static
|
|
// map. The function is intentionally pure of HTTP concerns so the
|
|
// daemon's pre-execution guard and the server's UpdateAgent gate can
|
|
// share the same source of truth.
|
|
func ValidateThinkingLevel(ctx context.Context, providerType, executablePath, model, value string) (bool, error) {
|
|
if value == "" {
|
|
return true, nil
|
|
}
|
|
models, err := ListModels(ctx, providerType, executablePath)
|
|
if err != nil {
|
|
return false, err
|
|
}
|
|
target := model
|
|
if target == "" {
|
|
// Default model = the entry the catalog marks as Default. If no
|
|
// entry is flagged, fall through to the no-match return; that
|
|
// matches the existing semantics where an unknown model fails
|
|
// closed rather than guessing.
|
|
for _, m := range models {
|
|
if m.Default {
|
|
target = m.ID
|
|
break
|
|
}
|
|
}
|
|
if target == "" {
|
|
return false, nil
|
|
}
|
|
}
|
|
for _, m := range models {
|
|
if m.ID != target {
|
|
continue
|
|
}
|
|
if m.Thinking == nil {
|
|
return false, nil
|
|
}
|
|
for _, lvl := range m.Thinking.SupportedLevels {
|
|
if lvl.Value == value {
|
|
return true, nil
|
|
}
|
|
}
|
|
return false, nil
|
|
}
|
|
return false, nil
|
|
}
|
|
|
|
// providerThinkingEnums is the server-side accept-list for each runtime's
|
|
// reasoning-effort vocabulary. The server doesn't have local CLI binaries,
|
|
// so it cannot do per-model discovery the way the daemon can; what it CAN
|
|
// do is reject values that are not in any version of the provider's enum
|
|
// at all. Per-model gaps (e.g. user sets `xhigh` while the chosen model
|
|
// only supports up to `high`) surface as a daemon-side task failure with
|
|
// a clear error, not a server-side 400 — that split is intentional so the
|
|
// API behaviour stays consistent (always-400 on literal-invalid, never
|
|
// auto-clear on combination-invalid). See MUL-2339 review notes.
|
|
//
|
|
// Keep these lists permissive: they're a "is this a known token in this
|
|
// runtime's universe" check, not an "is this the right level for this
|
|
// model" check. Adding a new level upstream means adding it here too so
|
|
// users can persist it before the next discovery refresh.
|
|
var providerThinkingEnums = map[string]map[string]bool{
|
|
"claude": {
|
|
"low": true,
|
|
"medium": true,
|
|
"high": true,
|
|
"xhigh": true,
|
|
"max": true,
|
|
},
|
|
"codex": {
|
|
"none": true,
|
|
"minimal": true,
|
|
"low": true,
|
|
"medium": true,
|
|
"high": true,
|
|
"xhigh": true,
|
|
},
|
|
}
|
|
|
|
// IsKnownThinkingValue reports whether `value` is a recognised effort
|
|
// token for the given provider. Empty string is always accepted (means
|
|
// "use runtime default"). Unknown providers (no thinking concept) accept
|
|
// only empty.
|
|
//
|
|
// This is the cheap synchronous gate the server uses on CreateAgent /
|
|
// UpdateAgent. Unlike ValidateThinkingLevel it does NOT consult the live
|
|
// catalog or per-model subset.
|
|
func IsKnownThinkingValue(providerType, value string) bool {
|
|
if value == "" {
|
|
return true
|
|
}
|
|
enum, ok := providerThinkingEnums[providerType]
|
|
if !ok {
|
|
return false
|
|
}
|
|
return enum[value]
|
|
}
|