Jiayuan Zhang a56af8a88b fix(server): MUL-4059 review fixes (P0-P2 + regression tests)
Addresses the converged P0-P2 review list from the adversarial
Code Review + Jack review. All four P0 blockers + the four P1
findings the reviewers asked to be re-examined on second pass.

**P0-1 Migration order (CRITICAL)**
- Move the backfill UPDATE to AFTER the ADD COLUMNs in
  migration 120. The original draft put UPDATE at the top before
  the column existed, which (a) aborts transactional migration
  runners and (b) silently no-ops the backfill on non-transactional
  runners, leaving every running task with last_activity_at IS
  NULL — the inactivity sweeper then trips the cold-start branch
  and kills every in-flight task on the first tick (auto-retry
  snowballs).
- The UPDATE is now gated on IS NULL so re-runs are idempotent.

**P0-2 Per-task max_inactivity_secs ignored**
- FailInactiveRunningTasks used a single @max_inactivity_secs
  scalar; sweeper passed only the server default. The whole task
  > agent > workspace > server-default chain was dead code.
- Replace the SQL with COALESCE(max_inactivity_secs, 1200) for
  per-row comparison. The function signature drops the
  parameter (no longer needed). Daemon-side runInactivityWatcher
  was already per-task via opts.MaxInactivitySecs; the two
  layers now agree.
- Regression test TestSweepInactiveTasks_PerRowMaxInactivityCap
  pins the per-row cap: two tasks with cap=60s, one idle 30s
  (alive), one idle 90s (dead).

**P0-3 errChatTaskContextMissing unexported**
- Rename to ErrChatTaskContextMissing + add OutcomeNoContext
  sentinel in lark/dispatcher.
- Web chat handler (handler/chat.go) now errors.Is on the
  sentinel and returns 422 with a tailored 'workspace has no
  linked repo' message.
- Lark dispatcher emits the new outcome so the chat user sees
  the same explanation in their Lark thread.

**P0-4 Autopilot guard never honors policy=off**
- Bug: autopilot used reason.Policy (never set) instead of the
  first return value of guardDecision. Off-policy autopilot
  enqueues were always rejected. Other enqueue paths (issue /
  mention / quick_create / chat) correctly used the first return
  value. Aligned.

**P1-5 sweepPendingContextTasks never checks project local_directory**
- pendingContextProjectID returned invalid UUID unconditionally
  ('avoid a per-row round-trip'). (B)-only workspaces could never
  requeue. Now does GetIssue(task.IssueID).ProjectID; the
  round-trip is rounding error against the partial-index lookup
  already happening.
- Regression test TestSweepPendingContextTasks_ProjectLocalDirectoryPasses
  seeds a project with local_directory on a (B)-only workspace
  and asserts the task requeues.

**P1-6 parkIssueTaskPendingContext no transaction**
- Concurrent cancel between CreateAgentTask and
  MarkAgentTaskPendingContext left the row in 'cancelled' but
  still flipped issue to 'blocked' + posted a misleading system
  comment. Wrap both writes in runInTx so the row is either
  fully parked or fully cancelled. Post-write side effects
  (issue flip / comment / event) stay outside the transaction.

**P1-6 (companion) workspace resolve failure -> task hangs forever**
- sweepPendingContextTasks d on workspaceIDFromTask
  returning invalid, so the revalidation counter never advanced.
  Now cancels the row with failure_reason='no_context' so the
  user sees 'workspace was deleted' instead of a stuck card.

**P1-7 Session resume reuses hung session**
- inactivity_timeout was in retryableReasons but NOT in
  GetLastTaskSession / GetLastChatTaskSession / CreateRetryTask
  CASE WHEN. Auto-retry would replay the hang.
- Aligned: inactivity_timeout is now in the resume-unsafe set
  alongside codex_semantic_inactivity. Three places + the Go
  helper resumeUnsafeFailureReason agree.
- Server can't tell 'genuine hang' from 'long build', so fresh
  session is the safe default.

**P1-8 issue blocked->in_progress re-switch missing**
- sweep requeue only called NotifyTaskEnqueued; the issue stayed
  blocked. New helper resumeIssueFromBlocked flips the issue
  back to in_progress (best-effort, skips terminal issues) and
  posts a 'Context guard now passes; resuming the task.' system
  comment.

**P1-10 HasActiveTaskForIssue / RefreshAgentStatusFromTasks missed**
- HasActiveTaskForIssue now includes pending_context so
  HandleFailedTasks's fallback doesn't flip blocked issues back
  to todo. RefreshAgentStatusFromTasks intentionally does NOT
  include pending_context so a parked task doesn't make the
  agent 'working' (presence indicator stays correct via
  ListWorkspaceAgentTaskSnapshot which already includes it).

**P2-8 autopilot claim only fills (A) not (B)**
- daemon handler for autopilot run_only now also fills
  ProjectResources from the autopilot's project so the daemon's
  secondary guard (daemonTaskHasUsableContext) sees local_directory.

**P2-9 down migration doesn't clean pending_context rows**
- down.sql now does UPDATE ... SET status='cancelled' WHERE
  status='pending_context' before DROPping the column. Without
  this, down was not fully reversible.

**P2-9 (companion) EventTaskPendingContext**
- Added the event constant in events.go and emit it from
  parkIssueTaskPendingContext (replacing the EventTaskQueued
  broadcast that masqueraded).

**P2-10 ContextGuardReason field name misleading**
- Daemon never read it. Dropped the field from both
  AgentTaskResponse and the daemon Task struct. The audit JSON
  envelope stays in the DB row for the task-detail endpoint.

**P2-11 EncodeReason dead code**
- Removed the vestigial  block.

**Test changes**
- 4 new unit tests: TestResumeUnsafeFailureReason_InactivityTimeoutAdded,
  TestErrChatTaskContextMissing_Exported, TestSweepInactiveTasks_PerRowMaxInactivityCap,
  TestSweepPendingContextTasks_ProjectLocalDirectoryPasses.
- 1 simplified unit test: TestTaskStruct_NewFieldsPickedUp (dropped
  the ContextGuardReason round-trip since the field is gone).

Co-authored-by: multica-agent <github@multica.ai>
2026-06-13 11:59:22 +08:00

Multica — humans and agents, side by side

Multica

Multica

Your next 10 hires won't be human.

The open-source managed agents platform.
Turn coding agents into real teammates — assign tasks, track progress, compound skills.

CI GitHub stars

Website · Cloud · X · Self-Hosting · Contributing

English | 简体中文

What is Multica?

Multica turns coding agents into real teammates. Assign issues to an agent like you'd assign to a colleague — they'll pick up the work, write code, report blockers, and update statuses autonomously.

No more copy-pasting prompts. No more babysitting runs. Your agents show up on the board, participate in conversations, and compound reusable skills over time. Think of it as open-source infrastructure for managed agents — vendor-neutral, self-hosted, and designed for human + AI teams. Works with Claude Code, Codex, GitHub Copilot CLI, OpenClaw, OpenCode, Hermes, Gemini, Pi, Cursor Agent, Kimi, and Kiro CLI.

For larger teams, Squads add a stable routing layer: assign work to a group led by an agent, and the leader delegates to the right member.

Multica board view

Why "Multica"?

Multica — Multiplexed Information and Computing Agent.

The name is a nod to Multics, the pioneering operating system of the 1960s that introduced time-sharing — letting multiple users share a single machine as if each had it to themselves. Unix was born as a deliberate simplification of Multics: one user, one task, one elegant philosophy.

We think the same inflection is happening again. For decades, software teams have been single-threaded — one engineer, one task, one context switch at a time. AI agents change that equation. Multica brings time-sharing back, but for an era where the "users" multiplexing the system are both humans and autonomous agents.

In Multica, agents are first-class teammates. They get assigned issues, report progress, raise blockers, and ship code — just like their human colleagues. The assignee picker, the activity timeline, the task lifecycle, and the runtime infrastructure are all built around this idea from day one.

Like Multics before it, the bet is on multiplexing: a small team shouldn't feel small. With the right system, two engineers and a fleet of agents can move like twenty.

Features

Multica manages the full agent lifecycle: from task assignment to execution monitoring to skill reuse.

  • Agents as Teammates — assign to an agent like you'd assign to a colleague. They have profiles, show up on the board, post comments, create issues, and report blockers proactively.
  • Squads — group agents (and humans) under a leader agent and assign work to the squad. The leader decides who should pick it up, so routing stays stable as the team grows. @FrontendTeam instead of @alice-or-bob-or-carol.
  • Autonomous Execution — set it and forget it. Full task lifecycle management (enqueue, claim, start, complete/fail) with real-time progress streaming via WebSocket.
  • Autopilots — schedule recurring work for agents. Cron triggers, webhooks, or manual runs — each autopilot creates the issue and routes it to an agent automatically, so daily standups, weekly reports, and periodic audits run themselves.
  • Reusable Skills — every solution becomes a reusable skill for the whole team. Deployments, migrations, code reviews — skills compound your team's capabilities over time.
  • Unified Runtimes — one dashboard for all your compute. Local daemons and cloud runtimes, auto-detection of available CLIs, real-time monitoring.
  • Multi-Workspace — organize work across teams with workspace-level isolation. Each workspace has its own agents, issues, and settings.

Quick Install

brew install multica-ai/tap/multica

Use brew upgrade multica-ai/tap/multica to keep the CLI current.

macOS / Linux (install script)

curl -fsSL https://raw.githubusercontent.com/multica-ai/multica/main/scripts/install.sh | bash

Use this if Homebrew is not available. The script installs the Multica CLI on macOS and Linux by using Homebrew when it is on PATH, otherwise it downloads the binary directly.

Windows (PowerShell)

irm https://raw.githubusercontent.com/multica-ai/multica/main/scripts/install.ps1 | iex

Then configure, authenticate, and start the daemon in one command:

multica setup          # Connect to Multica Cloud, log in, start daemon

Self-hosting? Add --with-server to deploy a full Multica server on your machine:

curl -fsSL https://raw.githubusercontent.com/multica-ai/multica/main/scripts/install.sh | bash -s -- --with-server
multica setup self-host

This pulls the official Multica images from GHCR (latest stable by default). Requires Docker. See the Self-Hosting Guide for details. If the selected GHCR tag has not been published yet, fall back to make selfhost-build from a checkout.


Getting Started

1. Set up and start the daemon

multica setup           # Configure, authenticate, and start the daemon

The daemon runs in the background and auto-detects agent CLIs (claude, codex, copilot, openclaw, opencode, hermes, gemini, pi, cursor-agent, kimi, kiro-cli, agy) on your PATH.

2. Verify your runtime

Open your workspace in the Multica web app. Navigate to Settings → Runtimes — you should see your machine listed as an active Runtime.

What is a Runtime? A Runtime is a compute environment that can execute agent tasks. It can be your local machine (via the daemon) or a cloud instance. Each runtime reports which agent CLIs are available, so Multica knows where to route work.

3. Create an agent

Go to Settings → Agents and click New Agent. Pick the runtime you just connected and choose a provider (Claude Code, Codex, GitHub Copilot CLI, OpenClaw, OpenCode, Hermes, Gemini, Pi, Cursor Agent, Kimi, Kiro CLI, or Antigravity). Give your agent a name — this is how it will appear on the board, in comments, and in assignments.

4. Assign your first task

Create an issue from the board (or via multica issue create), then assign it to your new agent. The agent will automatically pick up the task, execute it on your runtime, and report progress — just like a human teammate.


CLI

The multica CLI connects your local machine to Multica — authenticate, manage workspaces, and run the agent daemon.

Command Description
multica login Authenticate (opens browser)
multica daemon start Start the local agent runtime
multica daemon status Check daemon status
multica setup One-command setup for Multica Cloud (configure + login + start daemon)
multica setup self-host Same, but for self-hosted deployments
multica workspace list List your workspaces (current is marked with *)
multica workspace switch <id|slug> Switch the default workspace for this profile
multica issue list List issues in your workspace
multica issue create Create a new issue
multica update Update to the latest version

See the CLI and Daemon Guide for the full command reference.


Architecture

┌──────────────┐     ┌──────────────┐     ┌──────────────────┐
│   Next.js    │────>│  Go Backend  │────>│   PostgreSQL     │
│   Frontend   │<────│  (Chi + WS)  │<────│   (pgvector)     │
└──────────────┘     └──────┬───────┘     └──────────────────┘
                            │
                     ┌──────┴───────┐
                     │ Agent Daemon │  runs on your machine
                     └──────────────┘  (Claude Code, Codex, GitHub Copilot CLI,
                                        OpenCode, OpenClaw, Hermes, Gemini,
                                        Pi, Cursor Agent, Kimi, Kiro CLI)
Layer Stack
Frontend Next.js 16 (App Router)
Backend Go (Chi router, sqlc, gorilla/websocket)
Database PostgreSQL 17 with pgvector
Agent Runtime Local daemon executing Claude Code, Codex, GitHub Copilot CLI, OpenClaw, OpenCode, Hermes, Gemini, Pi, Cursor Agent, Kimi, or Kiro CLI

Development

For contributors working on the Multica codebase, see the Contributing Guide.

Prerequisites: Node.js v20+, pnpm v10.28+, Go v1.26+, Docker

make dev

make dev auto-detects your environment (main checkout or worktree), creates the env file, installs dependencies, sets up the database, runs migrations, and starts all services.

See CONTRIBUTING.md for the full development workflow, worktree support, testing, and troubleshooting.

An iOS mobile client lives in apps/mobile/ — see its README for how to build it onto your own iPhone.

Description
No description provided
Readme 216 MiB
Languages
Go 45.8%
TypeScript 45%
MDX 7.6%
PLpgSQL 0.5%
CSS 0.4%
Other 0.6%