Compare commits

5 Commits

Author SHA1 Message Date
Jiang Bohan
b62c7b83ae fix(daemon): machine-scoped daemon.id so CLI and Desktop share one identity
Dev found that starting Desktop after CLI (or vice versa) minted a second
runtime row per provider instead of re-binding to the existing one. Root
cause: EnsureDaemonID wrote under the profile directory, so the CLI daemon
(default profile) and the Desktop daemon (its own `desktop-<host>` profile)
each generated their own UUID even though they're the same machine.

daemon.id now lives at `~/.multica/daemon.id` only — machine-scoped, not
profile-scoped. Profiles still own their own config.json / log / token;
only identity is shared. With one id per machine, the existing unique
constraint `(workspace_id, daemon_id, provider)` naturally collapses
CLI+Desktop into a single runtime row per provider.

Test reversal: replaced TestEnsureDaemonID_ProfileIsolated with
TestEnsureDaemonID_MachineScopedAcrossProfiles, which pins the new
invariant (two sequential calls return the same UUID; no per-profile
daemon.id is written).
2026-04-17 15:27:22 +08:00
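The effect of moving the file from the profile directory to the machine directory can be illustrated with a small stdlib-only sketch. Everything in it is an assumption made for illustration: `ensureID` and `compareScopes` are hypothetical stand-ins (not the project's `EnsureDaemonID`), the directory names are invented, and a random hex string replaces the real UUID.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// ensureID is a hypothetical stand-in for EnsureDaemonID: it returns the id
// stored in dir/daemon.id and creates one on first use. A random hex string
// is used instead of a UUID to keep the sketch stdlib-only.
func ensureID(dir string) (string, error) {
	path := filepath.Join(dir, "daemon.id")
	data, err := os.ReadFile(path)
	if err == nil {
		if id := strings.TrimSpace(string(data)); id != "" {
			return id, nil
		}
	} else if !errors.Is(err, os.ErrNotExist) {
		return "", err
	}
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return "", err
	}
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	id := hex.EncodeToString(buf)
	if err := os.WriteFile(path, []byte(id+"\n"), 0o600); err != nil {
		return "", err
	}
	return id, nil
}

// compareScopes runs both layouts in a throwaway directory and reports
// whether the two simulated daemons ended up with the same id in each.
func compareScopes() (profileScopedEqual, machineScopedEqual bool, err error) {
	root, err := os.MkdirTemp("", "daemon-id-sketch-")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(root)

	ids := make([]string, 0, 4)
	for _, dir := range []string{
		filepath.Join(root, "old", "profiles", "default"),      // CLI, profile-scoped
		filepath.Join(root, "old", "profiles", "desktop-host"), // Desktop, profile-scoped
		filepath.Join(root, "new"),                             // CLI, machine-scoped
		filepath.Join(root, "new"),                             // Desktop, machine-scoped
	} {
		id, err := ensureID(dir)
		if err != nil {
			return false, false, err
		}
		ids = append(ids, id)
	}
	return ids[0] == ids[1], ids[2] == ids[3], nil
}

func main() {
	profileEq, machineEq, err := compareScopes()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("profile-scoped ids equal:", profileEq)
	fmt.Println("machine-scoped ids equal:", machineEq)
}
```

With one file per profile the two daemons mint independent ids; with a single shared file the second caller reads what the first one wrote, which is the property the unique constraint relies on.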
Jiang Bohan
2c9f1ac6c1 Merge remote-tracking branch 'origin/main' into agent/j/49ba9293
# Conflicts:
#	server/internal/handler/daemon_test.go
2026-04-17 15:02:45 +08:00
Jiang Bohan
b8bc828a6b fix(daemon): consolidate every case-duplicate legacy runtime, not just the first
Follow-up review on #1220: after switching to `LOWER(daemon_id) =
LOWER(@daemon_id)`, the single-row lookup still only merged one legacy
row per candidate. If a machine already had two rows in the DB that
differed only in casing (e.g. `Jiayuans-MacBook-Pro.local` AND
`jiayuans-macbook-pro.local` coexisting because earlier hostname drift
already minted a duplicate), only one of them got consolidated and the
other stayed orphaned, violating the "no duplicate runtime per machine
after backfill" acceptance criterion.

- FindLegacyRuntimeByDaemonID → FindLegacyRuntimesByDaemonID (:many)
- mergeLegacyRuntimes iterates every returned row and dedupes across
  overlapping legacy candidates so `foo` and `foo.local` both resolving
  to the same stored row don't double-process

Test: TestDaemonRegister_MergesAllCaseDuplicateLegacyRuntimes seeds two
case-duplicate rows with one agent each and confirms both rows are
deleted and both agents end up on the new UUID-keyed row.
2026-04-17 14:31:48 +08:00
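The multi-row lookup plus cross-candidate dedupe described in this commit can be sketched in memory. This is only an illustration under stated assumptions: `runtimeRow`, `findAllByDaemonID`, and `rowsToMerge` are hypothetical names, the slice stands in for the `agent_runtime` table, and `strings.ToLower` approximates SQL `LOWER()` for ASCII hostnames.

```go
package main

import (
	"fmt"
	"strings"
)

// runtimeRow is a hypothetical, trimmed-down stand-in for an agent_runtime row.
type runtimeRow struct {
	ID       string
	DaemonID string
}

// findAllByDaemonID mimics a :many lookup with
// LOWER(daemon_id) = LOWER(@daemon_id): it returns every matching row,
// not just the first one.
func findAllByDaemonID(rows []runtimeRow, daemonID string) []runtimeRow {
	var out []runtimeRow
	want := strings.ToLower(daemonID)
	for _, r := range rows {
		if strings.ToLower(r.DaemonID) == want {
			out = append(out, r)
		}
	}
	return out
}

// rowsToMerge collects the id of each legacy row exactly once, even when
// several candidates resolve to the same stored row, and never returns the
// newly registered row itself.
func rowsToMerge(rows []runtimeRow, candidates []string, newRowID string) []string {
	seen := make(map[string]struct{})
	var ids []string
	for _, c := range candidates {
		for _, r := range findAllByDaemonID(rows, c) {
			if r.ID == newRowID {
				continue
			}
			if _, ok := seen[r.ID]; ok {
				continue
			}
			seen[r.ID] = struct{}{}
			ids = append(ids, r.ID)
		}
	}
	return ids
}

func main() {
	rows := []runtimeRow{
		{ID: "r1", DaemonID: "Jiayuans-MacBook-Pro.local"},
		{ID: "r2", DaemonID: "jiayuans-macbook-pro.local"},
		{ID: "r3", DaemonID: "other-host"},
		{ID: "new", DaemonID: "uuid-keyed-row"},
	}
	candidates := []string{
		"jiayuans-macbook-pro",
		"jiayuans-macbook-pro.local",
		"JIAYUANS-MACBOOK-PRO.local", // overlaps with the previous candidate
	}
	fmt.Println(rowsToMerge(rows, candidates, "new")) // prints [r1 r2]
}
```

Both case-duplicate rows are returned once each, while the unrelated host and the new row are left alone.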
Jiang Bohan
35a2fb24ec fix(daemon): handle bidirectional .local drift and case drift in legacy merge
Review on #1220 flagged two gaps in the legacy-id migration candidate set:

1. Reverse .local: LegacyDaemonIDs only added the stripped variant when the
   current hostname ended in `.local`. The opposite direction — DB has
   `foo.local`, current host is `foo` — was missed, so runtimes registered
   under the `.local` variant stayed orphaned after upgrade. Now both
   variants (`foo` and `foo.local`) are always emitted, regardless of what
   `os.Hostname()` currently returns, plus their `-<profile>` suffix forms.

2. Case drift: os.Hostname() has been observed returning different casings
   on the same machine across mDNS/reboot state. A case-sensitive `=`
   comparison stranded rows like `Jiayuans-MacBook-Pro.local` when the
   daemon later reported `jiayuans-macbook-pro.local`. FindLegacyRuntimeByDaemonID
   now uses `LOWER(daemon_id) = LOWER(@daemon_id)` on both sides, so casing
   differences merge rather than orphan. The (workspace_id, provider) prefix
   still bounds the scan to a tiny set of rows so the non-indexed LOWER()
   comparison has negligible cost.

Tests: TestLegacyDaemonIDs gets the mixed-case + reverse-direction cases;
daemon_test.go adds TestDaemonRegister_MergesLegacyDaemonIDRuntime_ReverseDotLocal
and TestDaemonRegister_MergesLegacyDaemonIDRuntime_CaseDrift.
2026-04-17 14:22:39 +08:00
Jiang Bohan
a4616c6135 feat(daemon): persistent UUID identity + legacy-id merge at register-time
daemon_id is now a stable UUID persisted to `<profile-dir>/daemon.id` on
first start, replacing the hostname-derived id that drifted whenever
`.local` appeared/disappeared, a system was renamed, or a profile
switched — each of which used to mint a fresh `agent_runtime` row and
strand agents on the old one.

To migrate existing installs without operator intervention, the daemon
reports every legacy id it may have registered under previously
(`host`, `host` with `.local` stripped, and `host[-profile]` variants
for both). At register-time the server looks up each candidate row
scoped to (workspace, provider), re-points its agents and tasks onto
the new UUID-keyed row, records which legacy id was subsumed in the
new `legacy_daemon_id` column for audit, and deletes the stale row.
Result: users running `xxx.local`-keyed runtimes today transparently
land on the new UUID row on next daemon restart.

The hostname-prefix `MigrateAgentsToRuntime` / `daemon_id LIKE '...-%'`
compatibility shim is no longer needed and has been removed along with
the handler call that invoked it.
2026-04-17 02:40:11 +08:00
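The register-time merge described above (re-point agents and tasks, record the subsumed legacy id, delete the stale row) can be sketched with plain maps. The `store` type, its fields, and the ids below are all hypothetical and only loosely mirror the tables named in the commit; the real handler works through generated database queries.

```go
package main

import "fmt"

// store is a hypothetical in-memory stand-in for the agent_runtime, agent and
// agent_task_queue tables. It is not the project's schema.
type store struct {
	daemonIDByRuntime map[string]string // runtime id -> daemon_id
	legacyByRuntime   map[string]string // runtime id -> recorded legacy_daemon_id
	runtimeByAgent    map[string]string // agent id -> runtime id
	runtimeByTask     map[string]string // task id -> runtime id
}

// mergeLegacy moves everything that points at oldID onto newID, records which
// legacy daemon_id was subsumed, and only then deletes the stale row. The
// reassignment happens first so that deleting the row cannot take tasks with it.
func (s *store) mergeLegacy(oldID, newID, legacyDaemonID string) (agents, tasks int) {
	for agent, rt := range s.runtimeByAgent {
		if rt == oldID {
			s.runtimeByAgent[agent] = newID
			agents++
		}
	}
	for task, rt := range s.runtimeByTask {
		if rt == oldID {
			s.runtimeByTask[task] = newID
			tasks++
		}
	}
	s.legacyByRuntime[newID] = legacyDaemonID
	delete(s.daemonIDByRuntime, oldID)
	return agents, tasks
}

// newDemoStore seeds one legacy hostname-keyed row and one UUID-keyed row.
func newDemoStore() *store {
	return &store{
		daemonIDByRuntime: map[string]string{
			"rt-old": "xxx.local",
			"rt-new": "uuid-keyed-daemon-id",
		},
		legacyByRuntime: map[string]string{},
		runtimeByAgent:  map[string]string{"agent-1": "rt-old"},
		runtimeByTask:   map[string]string{"task-1": "rt-old"},
	}
}

func main() {
	s := newDemoStore()
	agents, tasks := s.mergeLegacy("rt-old", "rt-new", "xxx.local")
	fmt.Println("agents reassigned:", agents)
	fmt.Println("tasks reassigned:", tasks)
	fmt.Println("legacy id recorded:", s.legacyByRuntime["rt-new"])
	fmt.Println("runtime rows left:", len(s.daemonIDByRuntime))
}
```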
11 changed files with 1049 additions and 101 deletions

@@ -28,6 +28,7 @@ const (
type Config struct {
ServerBaseURL string
DaemonID string
LegacyDaemonIDs []string // historical daemon_ids this machine may have registered under; reported at register time so the server can merge old runtime rows
DeviceName string
RuntimeName string
CLIVersion string // multica CLI version (e.g. "0.1.13")
@@ -187,15 +188,33 @@ func LoadConfig(overrides Overrides) (Config, error) {
  // Profile
  profile := overrides.Profile
  // String overrides
- daemonID := envOrDefault("MULTICA_DAEMON_ID", host)
+ // daemon_id resolution: override > env > persistent UUID on disk.
+ // The persistent UUID is written once to `~/.multica/daemon.id` and
+ // reused forever so hostname drift (.local suffix, system rename, mDNS
+ // state) no longer mints a new runtime identity. The file is machine-
+ // wide (intentionally not per-profile) so CLI and Desktop daemons on
+ // the same host share one identity and collapse into a single runtime
+ // row via the `(workspace_id, daemon_id, provider)` unique constraint.
+ // Callers may still pin a specific id via MULTICA_DAEMON_ID or the
+ // override field (e.g. for tests or embedded environments).
+ daemonID := strings.TrimSpace(os.Getenv("MULTICA_DAEMON_ID"))
  if overrides.DaemonID != "" {
  daemonID = overrides.DaemonID
  }
- // NOTE: daemon_id is intentionally stable (hostname or explicit override).
- // The unique constraint (workspace_id, daemon_id, provider) already prevents
- // collisions within the same workspace. Appending the profile name caused
- // duplicate runtimes when users switched profiles.
+ if daemonID == "" {
+ persisted, err := EnsureDaemonID()
+ if err != nil {
+ return Config{}, fmt.Errorf("ensure daemon id: %w", err)
+ }
+ daemonID = persisted
+ }
+ // Historical daemon_ids derived from the current hostname/profile. The
+ // server uses these at register time to merge any pre-UUID runtime rows
+ // for this machine into the new UUID-keyed row and delete the stale ones.
+ legacyDaemonIDs := LegacyDaemonIDs(host, profile)
+ // Strip anything that collides with the resolved daemon_id (e.g. when
+ // the user explicitly pins MULTICA_DAEMON_ID=<hostname>).
+ legacyDaemonIDs = filterLegacyIDs(legacyDaemonIDs, daemonID)
  deviceName := envOrDefault("MULTICA_DAEMON_DEVICE_NAME", host)
  if overrides.DeviceName != "" {
@@ -258,6 +277,7 @@ func LoadConfig(overrides Overrides) (Config, error) {
return Config{
ServerBaseURL: serverBaseURL,
DaemonID: daemonID,
LegacyDaemonIDs: legacyDaemonIDs,
DeviceName: deviceName,
RuntimeName: runtimeName,
Profile: profile,
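The resolution order in `LoadConfig` (explicit override, then `MULTICA_DAEMON_ID`, then the persisted id) can be sketched as a small pure function. This is a sketch, not the project's code: `resolveDaemonID` and the `persist` callback are hypothetical, with `persist` standing in for `EnsureDaemonID` so the example does not touch the filesystem.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveDaemonID sketches the precedence: explicit override first, then the
// value of MULTICA_DAEMON_ID, then the id persisted on disk. persist is a
// hypothetical hook standing in for EnsureDaemonID.
func resolveDaemonID(override, envValue string, persist func() (string, error)) (string, error) {
	if override != "" {
		return override, nil
	}
	if id := strings.TrimSpace(envValue); id != "" {
		return id, nil
	}
	id, err := persist()
	if err != nil {
		return "", fmt.Errorf("ensure daemon id: %w", err)
	}
	return id, nil
}

func main() {
	persisted := func() (string, error) { return "persisted-uuid", nil }

	for _, tc := range []struct{ override, env string }{
		{"pinned-by-override", "pinned-by-env"},
		{"", "  pinned-by-env  "},
		{"", ""},
	} {
		id, err := resolveDaemonID(tc.override, tc.env, persisted)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("override=%q env=%q -> %s\n", tc.override, tc.env, id)
	}
}
```

Because the persisted id is only consulted when both pins are empty, tests and embedded environments can supply an id without a `daemon.id` file ever being created.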

@@ -225,12 +225,13 @@ func (d *Daemon) registerRuntimesForWorkspace(ctx context.Context, workspaceID s
  }
  req := map[string]any{
- "workspace_id": workspaceID,
- "daemon_id": d.cfg.DaemonID,
- "device_name": d.cfg.DeviceName,
- "cli_version": d.cfg.CLIVersion,
- "launched_by": d.cfg.LaunchedBy,
- "runtimes": runtimes,
+ "workspace_id": workspaceID,
+ "daemon_id": d.cfg.DaemonID,
+ "legacy_daemon_ids": d.cfg.LegacyDaemonIDs,
+ "device_name": d.cfg.DeviceName,
+ "cli_version": d.cfg.CLIVersion,
+ "launched_by": d.cfg.LaunchedBy,
+ "runtimes": runtimes,
  }
  resp, err := d.client.Register(ctx, req)

@@ -0,0 +1,154 @@
package daemon
import (
"errors"
"fmt"
"os"
"path/filepath"
"strings"
"github.com/google/uuid"
"github.com/multica-ai/multica/server/internal/cli"
)
// daemonIDFileName is the per-machine file that stores this host's stable
// daemon identifier. Once created, the UUID inside is the daemon's identity
// forever — hostname changes, .local suffix drift, profile switches, and
// system renames no longer mint a new identity.
const daemonIDFileName = "daemon.id"
// EnsureDaemonID returns a stable UUID for this machine, persisting it to
// disk at `~/.multica/daemon.id` on first call.
//
// The file is intentionally NOT per-profile. A single machine has one daemon
// identity regardless of which profile the user is running under — the CLI
// daemon (default profile) and the Desktop daemon (its own `desktop-<host>`
// profile) must both register against the same runtime row, or the user ends
// up with two rows per provider per workspace every time they open the
// Desktop app after using the CLI (or vice versa). The unique constraint
// `(workspace_id, daemon_id, provider)` then naturally collapses them.
//
// Profiles still own their own config.json / log / token — only *identity*
// is machine-wide.
//
// If the file exists but is corrupt (unparseable), it is regenerated so the
// daemon can continue starting up instead of hard-failing.
func EnsureDaemonID() (string, error) {
dir, err := cli.ProfileDir("")
if err != nil {
return "", err
}
path := filepath.Join(dir, daemonIDFileName)
if data, err := os.ReadFile(path); err == nil {
if id := strings.TrimSpace(string(data)); id != "" {
if _, perr := uuid.Parse(id); perr == nil {
return id, nil
}
}
} else if !errors.Is(err, os.ErrNotExist) {
return "", fmt.Errorf("read daemon id file: %w", err)
}
if err := os.MkdirAll(dir, 0o755); err != nil {
return "", fmt.Errorf("create multica directory: %w", err)
}
id, err := uuid.NewV7()
if err != nil {
return "", fmt.Errorf("generate daemon id: %w", err)
}
tmp, err := os.CreateTemp(dir, ".daemon-*.id.tmp")
if err != nil {
return "", fmt.Errorf("create temp daemon id file: %w", err)
}
tmpPath := tmp.Name()
if _, err := tmp.WriteString(id.String() + "\n"); err != nil {
tmp.Close()
os.Remove(tmpPath)
return "", fmt.Errorf("write temp daemon id file: %w", err)
}
if err := tmp.Close(); err != nil {
os.Remove(tmpPath)
return "", fmt.Errorf("close temp daemon id file: %w", err)
}
if err := os.Chmod(tmpPath, 0o600); err != nil {
os.Remove(tmpPath)
return "", fmt.Errorf("chmod temp daemon id file: %w", err)
}
if err := os.Rename(tmpPath, path); err != nil {
os.Remove(tmpPath)
return "", fmt.Errorf("rename daemon id file: %w", err)
}
return id.String(), nil
}
// LegacyDaemonIDs returns the set of daemon_id values this machine may have
// previously registered under, before the switch to a persistent UUID. The
// server uses this list at registration time to merge old runtime rows into
// the new UUID-keyed row (moving agents/tasks then deleting the stale row).
//
// Three historical formats are covered:
//
// - pre-#906: "<hostname>-<profile>" (profile suffix, no .local strip)
// - pre-#1070: "<hostname>" (raw hostname, often ends in .local)
// - current: "<hostname>" with .local drift depending on system state
//
// .local drift is bidirectional — at different times os.Hostname() has
// returned both "foo" and "foo.local" on the same machine (mDNS state,
// system restart, login item order). So regardless of which form is current
// now, we always emit BOTH the bare and .local-suffixed variants so migration
// covers whichever form was persisted previously. Case drift is handled on
// the server side via case-insensitive lookup, so we don't also emit cased
// permutations here.
func LegacyDaemonIDs(hostname, profile string) []string {
host := strings.TrimSpace(hostname)
if host == "" {
return nil
}
stripped := strings.TrimSuffix(host, ".local")
dotLocal := stripped + ".local"
hostForms := []string{stripped, dotLocal}
candidates := make([]string, 0, len(hostForms)*2)
candidates = append(candidates, hostForms...)
if profile != "" {
for _, h := range hostForms {
candidates = append(candidates, h+"-"+profile)
}
}
seen := make(map[string]struct{}, len(candidates))
out := make([]string, 0, len(candidates))
for _, c := range candidates {
if c == "" {
continue
}
if _, ok := seen[c]; ok {
continue
}
seen[c] = struct{}{}
out = append(out, c)
}
return out
}
// filterLegacyIDs removes any entry equal to current (e.g. when the user
// explicitly pins MULTICA_DAEMON_ID to the hostname itself, there's nothing
// to migrate — the row is already keyed on the current id).
func filterLegacyIDs(ids []string, current string) []string {
if current == "" {
return ids
}
out := ids[:0]
for _, id := range ids {
if id == current {
continue
}
out = append(out, id)
}
return out
}

@@ -0,0 +1,169 @@
package daemon
import (
"os"
"path/filepath"
"reflect"
"strings"
"testing"
"github.com/google/uuid"
)
func TestEnsureDaemonID_Persists(t *testing.T) {
home := t.TempDir()
t.Setenv("HOME", home)
first, err := EnsureDaemonID()
if err != nil {
t.Fatalf("EnsureDaemonID first call: %v", err)
}
if _, err := uuid.Parse(first); err != nil {
t.Fatalf("EnsureDaemonID returned non-UUID: %q", first)
}
path := filepath.Join(home, ".multica", "daemon.id")
data, err := os.ReadFile(path)
if err != nil {
t.Fatalf("daemon.id not written: %v", err)
}
if strings.TrimSpace(string(data)) != first {
t.Fatalf("file contents %q differ from returned UUID %q", data, first)
}
second, err := EnsureDaemonID()
if err != nil {
t.Fatalf("EnsureDaemonID second call: %v", err)
}
if second != first {
t.Fatalf("UUID changed on second call: %q → %q", first, second)
}
}
// TestEnsureDaemonID_MachineScopedAcrossProfiles pins the behavior the user
// needs: identity is machine-wide, not profile-scoped. The CLI daemon and the
// Desktop daemon (which runs under its own `desktop-<host>` profile) must end
// up with the same daemon_id when running on the same machine, so they
// register against a single runtime row instead of minting a new row every
// time the Desktop app opens alongside the CLI.
func TestEnsureDaemonID_MachineScopedAcrossProfiles(t *testing.T) {
home := t.TempDir()
t.Setenv("HOME", home)
cliID, err := EnsureDaemonID()
if err != nil {
t.Fatalf("first call: %v", err)
}
// Simulate a second daemon process (e.g. Desktop) starting up — it must
// read the same file, not mint a new UUID.
desktopID, err := EnsureDaemonID()
if err != nil {
t.Fatalf("second call: %v", err)
}
if cliID != desktopID {
t.Fatalf("machine identity drifted between calls: %s vs %s", cliID, desktopID)
}
// And no stray per-profile daemon.id should have been written.
if _, err := os.Stat(filepath.Join(home, ".multica", "profiles", "desktop-api.example.com", "daemon.id")); !os.IsNotExist(err) {
t.Fatalf("unexpected per-profile daemon.id present: err=%v", err)
}
}
func TestEnsureDaemonID_RegeneratesCorruptFile(t *testing.T) {
home := t.TempDir()
t.Setenv("HOME", home)
dir := filepath.Join(home, ".multica")
if err := os.MkdirAll(dir, 0o755); err != nil {
t.Fatalf("mkdir: %v", err)
}
path := filepath.Join(dir, "daemon.id")
if err := os.WriteFile(path, []byte("not-a-uuid"), 0o600); err != nil {
t.Fatalf("seed corrupt file: %v", err)
}
id, err := EnsureDaemonID()
if err != nil {
t.Fatalf("EnsureDaemonID: %v", err)
}
if _, err := uuid.Parse(id); err != nil {
t.Fatalf("expected valid UUID, got %q", id)
}
data, _ := os.ReadFile(path)
if strings.TrimSpace(string(data)) != id {
t.Fatalf("file not rewritten with new UUID")
}
}
func TestLegacyDaemonIDs(t *testing.T) {
cases := []struct {
name string
hostname string
profile string
want []string
}{
{
// Bare hostname now — but the DB may still hold the previously
// registered `.local` variant, so we must emit both.
name: "plain hostname, no profile",
hostname: "MacBook-Pro",
want: []string{"MacBook-Pro", "MacBook-Pro.local"},
},
{
// Dot-local hostname now — the stripped variant may be what the
// DB holds from a prior registration where .local was absent.
name: "dot-local hostname, no profile",
hostname: "MacBook-Pro.local",
want: []string{"MacBook-Pro", "MacBook-Pro.local"},
},
{
name: "plain hostname with profile",
hostname: "MacBook-Pro",
profile: "staging",
want: []string{
"MacBook-Pro",
"MacBook-Pro.local",
"MacBook-Pro-staging",
"MacBook-Pro.local-staging",
},
},
{
name: "dot-local hostname with profile",
hostname: "MacBook-Pro.local",
profile: "staging",
want: []string{
"MacBook-Pro",
"MacBook-Pro.local",
"MacBook-Pro-staging",
"MacBook-Pro.local-staging",
},
},
{
name: "empty hostname",
hostname: "",
want: nil,
},
{
// Case drift is handled on the server side (LOWER=LOWER match).
// We still emit the hostname in its current casing here; the SQL
// query normalizes both sides.
name: "mixed case hostname preserved as-is",
hostname: "Jiayuans-MacBook-Pro.local",
want: []string{
"Jiayuans-MacBook-Pro",
"Jiayuans-MacBook-Pro.local",
},
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := LegacyDaemonIDs(tc.hostname, tc.profile)
if !reflect.DeepEqual(got, tc.want) {
t.Fatalf("LegacyDaemonIDs(%q, %q) = %v, want %v", tc.hostname, tc.profile, got, tc.want)
}
})
}
}

@@ -103,10 +103,15 @@ func (h *Handler) verifyDaemonWorkspaceAccess(r *http.Request, workspaceID strin
  type DaemonRegisterRequest struct {
  WorkspaceID string `json:"workspace_id"`
  DaemonID string `json:"daemon_id"`
- DeviceName string `json:"device_name"`
- CLIVersion string `json:"cli_version"` // multica CLI version
- LaunchedBy string `json:"launched_by"` // "desktop" when spawned by the Electron app
- Runtimes []struct {
+ // LegacyDaemonIDs lists prior hostname-derived daemon_ids this machine
+ // may have registered under before switching to a persistent UUID. The
+ // handler merges any matching runtime rows into the new row so agents
+ // and tasks keep working without manual intervention.
+ LegacyDaemonIDs []string `json:"legacy_daemon_ids"`
+ DeviceName string `json:"device_name"`
+ CLIVersion string `json:"cli_version"` // multica CLI version
+ LaunchedBy string `json:"launched_by"` // "desktop" when spawned by the Electron app
+ Runtimes []struct {
  Name string `json:"name"`
  Type string `json:"type"`
  Version string `json:"version"` // agent CLI version (claude/codex)
@@ -272,26 +277,12 @@ func (h *Handler) DaemonRegister(w http.ResponseWriter, r *http.Request) {
  return
  }
- // Migrate agents from old offline runtimes on the same machine to the
- // newly registered runtime. Uses the runtime's owner_id (preserved via
- // COALESCE on upsert) so migration works with both PAT and daemon tokens.
- // Scoped by daemon_id prefix so that only old profile-suffixed runtimes
- // (e.g. "hostname-staging") from this machine are affected.
- effectiveOwnerID := registered.OwnerID
- if effectiveOwnerID.Valid {
- migrated, err := h.Queries.MigrateAgentsToRuntime(r.Context(), db.MigrateAgentsToRuntimeParams{
- NewRuntimeID: registered.ID,
- WorkspaceID: parseUUID(req.WorkspaceID),
- Provider: provider,
- OwnerID: effectiveOwnerID,
- DaemonIDPrefix: strToText(req.DaemonID),
- })
- if err != nil {
- slog.Warn("failed to migrate agents to new runtime", "runtime_id", uuidToString(registered.ID), "error", err)
- } else if migrated > 0 {
- slog.Info("migrated agents to new runtime", "runtime_id", uuidToString(registered.ID), "provider", provider, "migrated_count", migrated)
- }
- }
+ // Seamless migration from the previous hostname-derived identity. The
+ // daemon sends every legacy daemon_id it may have registered under
+ // (e.g. "host.local", "host", "host-staging"); for each match we
+ // reassign agents + tasks onto the new UUID-keyed row, then delete
+ // the stale row so there's only ever one runtime per machine.
+ h.mergeLegacyRuntimes(r, registered, provider, req.LegacyDaemonIDs)
  resp = append(resp, runtimeToResponse(registered))
  }
@@ -310,6 +301,88 @@ func (h *Handler) DaemonRegister(w http.ResponseWriter, r *http.Request) {
})
}
// mergeLegacyRuntimes folds every runtime row keyed on a prior hostname-derived
// daemon_id into the newly registered UUID-keyed row. For each legacy id the
// lookup is case-insensitive and returns *all* matching rows — case-only drift
// may have already minted duplicates historically (e.g. `Foo.local` AND
// `foo.local` coexisting), and we need to consolidate every one of them, not
// just the first. Per match we reassign agents and tasks, record the legacy
// id on the new row for audit, then delete the stale row.
//
// Scoping by (workspace_id, provider) is sufficient since provider is single-
// runtime-per-daemon; `unique (workspace_id, daemon_id, provider)` prevents
// any two *exact* matches but the `LOWER(...)` comparison crosses that bound
// precisely when case-duplicate rows exist — which is the bug we're fixing.
// We also dedupe across legacy ids so overlapping candidates (e.g. `foo` and
// `foo.local` both resolving to the same stored row) don't double-process.
func (h *Handler) mergeLegacyRuntimes(r *http.Request, registered db.AgentRuntime, provider string, legacyIDs []string) {
newID := uuidToString(registered.ID)
merged := make(map[string]struct{})
for _, legacyID := range legacyIDs {
legacyID = strings.TrimSpace(legacyID)
if legacyID == "" {
continue
}
matches, err := h.Queries.FindLegacyRuntimesByDaemonID(r.Context(), db.FindLegacyRuntimesByDaemonIDParams{
WorkspaceID: registered.WorkspaceID,
Provider: provider,
DaemonID: legacyID,
})
if err != nil {
slog.Warn("legacy runtime merge: lookup failed", "legacy_daemon_id", legacyID, "error", err)
continue
}
for _, old := range matches {
oldID := uuidToString(old.ID)
if oldID == newID {
continue
}
if _, seen := merged[oldID]; seen {
continue
}
merged[oldID] = struct{}{}
agents, err := h.Queries.ReassignAgentsToRuntime(r.Context(), db.ReassignAgentsToRuntimeParams{
NewRuntimeID: registered.ID,
OldRuntimeID: old.ID,
})
if err != nil {
slog.Warn("legacy runtime merge: reassign agents failed", "legacy_daemon_id", legacyID, "old_runtime_id", oldID, "new_runtime_id", newID, "error", err)
continue
}
tasks, err := h.Queries.ReassignTasksToRuntime(r.Context(), db.ReassignTasksToRuntimeParams{
NewRuntimeID: registered.ID,
OldRuntimeID: old.ID,
})
if err != nil {
slog.Warn("legacy runtime merge: reassign tasks failed", "legacy_daemon_id", legacyID, "old_runtime_id", oldID, "new_runtime_id", newID, "error", err)
continue
}
if err := h.Queries.RecordRuntimeLegacyDaemonID(r.Context(), db.RecordRuntimeLegacyDaemonIDParams{
ID: registered.ID,
LegacyDaemonID: strToText(legacyID),
}); err != nil {
slog.Warn("legacy runtime merge: record legacy daemon_id failed", "legacy_daemon_id", legacyID, "error", err)
}
if err := h.Queries.DeleteAgentRuntime(r.Context(), old.ID); err != nil {
slog.Warn("legacy runtime merge: delete old runtime failed", "old_runtime_id", oldID, "error", err)
continue
}
slog.Info("legacy runtime merged",
"legacy_daemon_id", legacyID,
"old_runtime_id", oldID,
"new_runtime_id", newID,
"provider", provider,
"agents_reassigned", agents,
"tasks_reassigned", tasks,
)
}
}
}
func (h *Handler) GetDaemonWorkspaceRepos(w http.ResponseWriter, r *http.Request) {
workspaceID := strings.TrimSpace(chi.URLParam(r, "workspaceId"))
if !h.requireDaemonWorkspaceAccess(w, r, workspaceID) {

@@ -645,6 +645,414 @@ func TestGetDaemonWorkspaceRepos_VersionIgnoresOrderAndDescription(t *testing.T)
}
}
// TestDaemonRegister_MergesLegacyDaemonIDRuntime simulates the migration path
// for an existing user whose runtime was previously keyed on a hostname-derived
// daemon_id (e.g. "MacBook-Pro.local"). After the daemon switches to a stable
// UUID, the registration payload lists the old id under `legacy_daemon_ids`.
// The server must:
//
// - reassign every agent pointing at the old runtime row to the new row,
// - reassign every task (agent_task_queue.runtime_id) onto the new row,
// - delete the stale old row so there's exactly one runtime per machine,
// - record the legacy daemon_id on the new row for traceability.
//
// This is the acceptance path from MUL-975: hostname drift must no longer
// orphan agents on stale runtime rows.
func TestDaemonRegister_MergesLegacyDaemonIDRuntime(t *testing.T) {
if testHandler == nil {
t.Skip("database not available")
}
ctx := context.Background()
const legacyDaemonID = "TestMachine.local"
const newDaemonID = "0192a7a0-9ab3-7c3f-9f1c-4a6fe8c4e801"
// Seed a legacy runtime row keyed on the hostname-derived id.
var legacyRuntimeID string
if err := testPool.QueryRow(ctx, `
INSERT INTO agent_runtime (workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, owner_id, last_seen_at)
VALUES ($1, $2, 'legacy-runtime', 'local', 'claude', 'offline', 'TestMachine.local', '{}'::jsonb, $3, now() - interval '1 hour')
RETURNING id
`, testWorkspaceID, legacyDaemonID, testUserID).Scan(&legacyRuntimeID); err != nil {
t.Fatalf("seed legacy runtime: %v", err)
}
t.Cleanup(func() {
testPool.Exec(context.Background(), `DELETE FROM agent_runtime WHERE id = $1`, legacyRuntimeID)
})
// An agent bound to the legacy runtime.
var legacyAgentID string
if err := testPool.QueryRow(ctx, `
INSERT INTO agent (workspace_id, name, runtime_mode, runtime_config, runtime_id, visibility, max_concurrent_tasks)
VALUES ($1, 'legacy-agent', 'local', '{}'::jsonb, $2, 'workspace', 1)
RETURNING id
`, testWorkspaceID, legacyRuntimeID).Scan(&legacyAgentID); err != nil {
t.Fatalf("seed legacy agent: %v", err)
}
t.Cleanup(func() {
testPool.Exec(context.Background(), `DELETE FROM agent WHERE id = $1`, legacyAgentID)
})
// An issue + task also bound to the legacy runtime (tasks have ON DELETE
// CASCADE, so without reassignment deleting the legacy row would silently
// drop historical tasks).
var legacyIssueID, legacyTaskID string
if err := testPool.QueryRow(ctx, `
INSERT INTO issue (workspace_id, title, status, priority, creator_id, creator_type, number, position)
VALUES ($1, 'legacy-task-owner', 'todo', 'medium', $2, 'member', 97501, 0)
RETURNING id
`, testWorkspaceID, testUserID).Scan(&legacyIssueID); err != nil {
t.Fatalf("seed legacy issue: %v", err)
}
t.Cleanup(func() { testPool.Exec(context.Background(), `DELETE FROM issue WHERE id = $1`, legacyIssueID) })
if err := testPool.QueryRow(ctx, `
INSERT INTO agent_task_queue (agent_id, issue_id, status, runtime_id)
VALUES ($1, $2, 'completed', $3)
RETURNING id
`, legacyAgentID, legacyIssueID, legacyRuntimeID).Scan(&legacyTaskID); err != nil {
t.Fatalf("seed legacy task: %v", err)
}
t.Cleanup(func() { testPool.Exec(context.Background(), `DELETE FROM agent_task_queue WHERE id = $1`, legacyTaskID) })
// Register under the new stable UUID, declaring the prior hostname-derived
// id as legacy. The handler should merge the legacy row into the new one.
w := httptest.NewRecorder()
req := newRequest("POST", "/api/daemon/register", map[string]any{
"workspace_id": testWorkspaceID,
"daemon_id": newDaemonID,
"legacy_daemon_ids": []string{legacyDaemonID},
"device_name": "TestMachine",
"runtimes": []map[string]any{
{"name": "test-runtime", "type": "claude", "version": "1.0.0", "status": "online"},
},
})
testHandler.DaemonRegister(w, req)
if w.Code != http.StatusOK {
t.Fatalf("DaemonRegister: expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]any
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("decode response: %v", err)
}
runtimes := resp["runtimes"].([]any)
newRuntimeID := runtimes[0].(map[string]any)["id"].(string)
t.Cleanup(func() {
testPool.Exec(context.Background(), `DELETE FROM agent_runtime WHERE id = $1`, newRuntimeID)
})
if newRuntimeID == legacyRuntimeID {
t.Fatalf("expected a new runtime row, got the legacy id back")
}
// Agent should now point at the new runtime.
var agentRuntimeID string
if err := testPool.QueryRow(ctx, `SELECT runtime_id FROM agent WHERE id = $1`, legacyAgentID).Scan(&agentRuntimeID); err != nil {
t.Fatalf("read agent runtime_id: %v", err)
}
if agentRuntimeID != newRuntimeID {
t.Fatalf("agent not reassigned: got runtime_id=%s, want %s", agentRuntimeID, newRuntimeID)
}
// Task should be reassigned (not dropped).
var taskRuntimeID string
if err := testPool.QueryRow(ctx, `SELECT runtime_id FROM agent_task_queue WHERE id = $1`, legacyTaskID).Scan(&taskRuntimeID); err != nil {
t.Fatalf("read task runtime_id: %v", err)
}
if taskRuntimeID != newRuntimeID {
t.Fatalf("task not reassigned: got runtime_id=%s, want %s", taskRuntimeID, newRuntimeID)
}
// Legacy runtime row must be gone — no more "online + offline" duplicates
// for the same machine.
var legacyCount int
if err := testPool.QueryRow(ctx, `SELECT count(*) FROM agent_runtime WHERE id = $1`, legacyRuntimeID).Scan(&legacyCount); err != nil {
t.Fatalf("count legacy runtime: %v", err)
}
if legacyCount != 0 {
t.Fatalf("expected legacy runtime row to be deleted, still present")
}
// New row should record which legacy id it subsumed, for debug/audit.
var legacyTrace *string
if err := testPool.QueryRow(ctx, `SELECT legacy_daemon_id FROM agent_runtime WHERE id = $1`, newRuntimeID).Scan(&legacyTrace); err != nil {
t.Fatalf("read legacy_daemon_id: %v", err)
}
if legacyTrace == nil || *legacyTrace != legacyDaemonID {
t.Fatalf("expected legacy_daemon_id=%q, got %v", legacyDaemonID, legacyTrace)
}
}
// TestDaemonRegister_MergesLegacyDaemonIDRuntime_ReverseDotLocal covers the
// direction missed by the initial implementation: the stored runtime row is
// `host` (no `.local`) but the daemon's current `os.Hostname()` now returns
// `host.local`. The daemon must emit the bare variant as a legacy candidate
// and the server must match it.
func TestDaemonRegister_MergesLegacyDaemonIDRuntime_ReverseDotLocal(t *testing.T) {
if testHandler == nil {
t.Skip("database not available")
}
ctx := context.Background()
const legacyDaemonID = "ReverseDotLocalHost" // stored without .local
const emittedLegacyID = "ReverseDotLocalHost.local" // daemon now reports with .local
const newDaemonID = "0192a7b0-0011-7ee9-9c21-30a5bcf86aa2"
var legacyRuntimeID string
if err := testPool.QueryRow(ctx, `
INSERT INTO agent_runtime (workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, owner_id, last_seen_at)
VALUES ($1, $2, 'legacy-runtime-reverse', 'local', 'claude', 'offline', '', '{}'::jsonb, $3, now())
RETURNING id
`, testWorkspaceID, legacyDaemonID, testUserID).Scan(&legacyRuntimeID); err != nil {
t.Fatalf("seed legacy runtime: %v", err)
}
t.Cleanup(func() {
testPool.Exec(context.Background(), `DELETE FROM agent_runtime WHERE id = $1`, legacyRuntimeID)
})
w := httptest.NewRecorder()
req := newRequest("POST", "/api/daemon/register", map[string]any{
"workspace_id": testWorkspaceID,
"daemon_id": newDaemonID,
"legacy_daemon_ids": []string{"ReverseDotLocalHost", emittedLegacyID},
"device_name": "ReverseDotLocalHost",
"runtimes": []map[string]any{
{"name": "reverse-runtime", "type": "claude", "version": "1.0.0", "status": "online"},
},
})
testHandler.DaemonRegister(w, req)
if w.Code != http.StatusOK {
t.Fatalf("DaemonRegister: expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]any
json.NewDecoder(w.Body).Decode(&resp)
newRuntimeID := resp["runtimes"].([]any)[0].(map[string]any)["id"].(string)
t.Cleanup(func() {
testPool.Exec(context.Background(), `DELETE FROM agent_runtime WHERE id = $1`, newRuntimeID)
})
var legacyCount int
if err := testPool.QueryRow(ctx, `SELECT count(*) FROM agent_runtime WHERE id = $1`, legacyRuntimeID).Scan(&legacyCount); err != nil {
t.Fatalf("count legacy runtime: %v", err)
}
if legacyCount != 0 {
t.Fatalf("expected legacy row to be merged and deleted, still present")
}
}
// TestDaemonRegister_MergesLegacyDaemonIDRuntime_CaseDrift verifies that
// case-only drift in os.Hostname() output (e.g. `Jiayuans-MacBook-Pro.local`
// vs `jiayuans-macbook-pro.local`) still merges the legacy row. The daemon
// emits the id in its current casing; the server-side lookup uses LOWER() on
// both sides so stored and emitted casings can differ without orphaning.
func TestDaemonRegister_MergesLegacyDaemonIDRuntime_CaseDrift(t *testing.T) {
if testHandler == nil {
t.Skip("database not available")
}
ctx := context.Background()
const storedDaemonID = "Jiayuans-MacBook-Pro.local" // DB has original mixed case
const emittedLegacyID = "jiayuans-macbook-pro.local" // Daemon now reports lowercased
const newDaemonID = "0192a7b0-0022-7ee9-9c21-30a5bcf86aa3"
var legacyRuntimeID string
if err := testPool.QueryRow(ctx, `
INSERT INTO agent_runtime (workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, owner_id, last_seen_at)
VALUES ($1, $2, 'legacy-runtime-case', 'local', 'claude', 'offline', '', '{}'::jsonb, $3, now())
RETURNING id
`, testWorkspaceID, storedDaemonID, testUserID).Scan(&legacyRuntimeID); err != nil {
t.Fatalf("seed legacy runtime: %v", err)
}
t.Cleanup(func() {
testPool.Exec(context.Background(), `DELETE FROM agent_runtime WHERE id = $1`, legacyRuntimeID)
})
w := httptest.NewRecorder()
req := newRequest("POST", "/api/daemon/register", map[string]any{
"workspace_id": testWorkspaceID,
"daemon_id": newDaemonID,
"legacy_daemon_ids": []string{emittedLegacyID},
"device_name": "jiayuans-macbook-pro",
"runtimes": []map[string]any{
{"name": "case-drift-runtime", "type": "claude", "version": "1.0.0", "status": "online"},
},
})
testHandler.DaemonRegister(w, req)
if w.Code != http.StatusOK {
t.Fatalf("DaemonRegister: expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]any
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("decode register response: %v", err)
}
newRuntimeID := resp["runtimes"].([]any)[0].(map[string]any)["id"].(string)
t.Cleanup(func() {
testPool.Exec(context.Background(), `DELETE FROM agent_runtime WHERE id = $1`, newRuntimeID)
})
var legacyCount int
if err := testPool.QueryRow(ctx, `SELECT count(*) FROM agent_runtime WHERE id = $1`, legacyRuntimeID).Scan(&legacyCount); err != nil {
t.Fatalf("count legacy runtime: %v", err)
}
if legacyCount != 0 {
t.Fatalf("expected case-drift legacy row to be merged and deleted, still present")
}
var legacyTrace *string
if err := testPool.QueryRow(ctx, `SELECT legacy_daemon_id FROM agent_runtime WHERE id = $1`, newRuntimeID).Scan(&legacyTrace); err != nil {
t.Fatalf("read legacy_daemon_id: %v", err)
}
if legacyTrace == nil || *legacyTrace != emittedLegacyID {
t.Fatalf("expected legacy_daemon_id trace = %q, got %v", emittedLegacyID, legacyTrace)
}
}
// TestDaemonRegister_MergesAllCaseDuplicateLegacyRuntimes covers the case
// where the DB already holds *two* legacy runtime rows that differ only in
// casing (e.g. `Jiayuans-MacBook-Pro.local` AND `jiayuans-macbook-pro.local`
// coexist under the same workspace+provider because earlier hostname drift
// already minted a duplicate). A single-row lookup would merge only one of
// them and leave the other orphaned; the lookup must return every row whose
// daemon_id case-insensitively matches and the handler must consolidate them
// all. This is the acceptance-standard path: after registration there must
// not be two runtime rows for the same machine.
func TestDaemonRegister_MergesAllCaseDuplicateLegacyRuntimes(t *testing.T) {
if testHandler == nil {
t.Skip("database not available")
}
ctx := context.Background()
const storedUpperID = "DupHost.local"
const storedLowerID = "duphost.local"
const newDaemonID = "0192a7b0-0033-7ee9-9c21-30a5bcf86aa4"
var legacyUpperID, legacyLowerID string
if err := testPool.QueryRow(ctx, `
INSERT INTO agent_runtime (workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, owner_id, last_seen_at)
VALUES ($1, $2, 'legacy-upper', 'local', 'claude', 'offline', '', '{}'::jsonb, $3, now() - interval '2 hours')
RETURNING id
`, testWorkspaceID, storedUpperID, testUserID).Scan(&legacyUpperID); err != nil {
t.Fatalf("seed upper-case legacy runtime: %v", err)
}
t.Cleanup(func() { testPool.Exec(context.Background(), `DELETE FROM agent_runtime WHERE id = $1`, legacyUpperID) })
if err := testPool.QueryRow(ctx, `
INSERT INTO agent_runtime (workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, owner_id, last_seen_at)
VALUES ($1, $2, 'legacy-lower', 'local', 'claude', 'offline', '', '{}'::jsonb, $3, now() - interval '1 hour')
RETURNING id
`, testWorkspaceID, storedLowerID, testUserID).Scan(&legacyLowerID); err != nil {
t.Fatalf("seed lower-case legacy runtime: %v", err)
}
t.Cleanup(func() { testPool.Exec(context.Background(), `DELETE FROM agent_runtime WHERE id = $1`, legacyLowerID) })
// Bind one agent to each legacy row to verify both sides get reassigned.
var upperAgentID, lowerAgentID string
if err := testPool.QueryRow(ctx, `
INSERT INTO agent (workspace_id, name, runtime_mode, runtime_config, runtime_id, visibility, max_concurrent_tasks)
VALUES ($1, 'dup-agent-upper', 'local', '{}'::jsonb, $2, 'workspace', 1)
RETURNING id
`, testWorkspaceID, legacyUpperID).Scan(&upperAgentID); err != nil {
t.Fatalf("seed upper agent: %v", err)
}
t.Cleanup(func() { testPool.Exec(context.Background(), `DELETE FROM agent WHERE id = $1`, upperAgentID) })
if err := testPool.QueryRow(ctx, `
INSERT INTO agent (workspace_id, name, runtime_mode, runtime_config, runtime_id, visibility, max_concurrent_tasks)
VALUES ($1, 'dup-agent-lower', 'local', '{}'::jsonb, $2, 'workspace', 1)
RETURNING id
`, testWorkspaceID, legacyLowerID).Scan(&lowerAgentID); err != nil {
t.Fatalf("seed lower agent: %v", err)
}
t.Cleanup(func() { testPool.Exec(context.Background(), `DELETE FROM agent WHERE id = $1`, lowerAgentID) })
w := httptest.NewRecorder()
req := newRequest("POST", "/api/daemon/register", map[string]any{
"workspace_id": testWorkspaceID,
"daemon_id": newDaemonID,
"legacy_daemon_ids": []string{storedLowerID}, // a single candidate must resolve both stored casings
"device_name": "DupHost",
"runtimes": []map[string]any{
{"name": "dup-runtime", "type": "claude", "version": "1.0.0", "status": "online"},
},
})
testHandler.DaemonRegister(w, req)
if w.Code != http.StatusOK {
t.Fatalf("DaemonRegister: expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]any
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("decode register response: %v", err)
}
newRuntimeID := resp["runtimes"].([]any)[0].(map[string]any)["id"].(string)
t.Cleanup(func() {
testPool.Exec(context.Background(), `DELETE FROM agent_runtime WHERE id = $1`, newRuntimeID)
})
// Both case-duplicate legacy rows must be gone — not just one.
var stillPresent int
if err := testPool.QueryRow(ctx, `
SELECT count(*) FROM agent_runtime WHERE id = ANY($1)
`, []string{legacyUpperID, legacyLowerID}).Scan(&stillPresent); err != nil {
t.Fatalf("count legacy runtimes: %v", err)
}
if stillPresent != 0 {
t.Fatalf("expected both case-duplicate legacy rows merged and deleted, %d still present", stillPresent)
}
// Both agents must point at the new runtime.
for _, agentID := range []string{upperAgentID, lowerAgentID} {
var runtimeID string
if err := testPool.QueryRow(ctx, `SELECT runtime_id FROM agent WHERE id = $1`, agentID).Scan(&runtimeID); err != nil {
t.Fatalf("read agent runtime_id: %v", err)
}
if runtimeID != newRuntimeID {
t.Fatalf("agent %s not reassigned: runtime_id=%s, want %s", agentID, runtimeID, newRuntimeID)
}
}
}
// TestDaemonRegister_LegacyIDNoMatchIsNoop guards the common case where the
// daemon sends legacy candidates but no matching row exists (e.g. first
// registration on a fresh machine). Registration must still succeed, the new
// row must not have a spurious legacy_daemon_id recorded, and no unrelated
// rows may be touched.
func TestDaemonRegister_LegacyIDNoMatchIsNoop(t *testing.T) {
if testHandler == nil {
t.Skip("database not available")
}
ctx := context.Background()
w := httptest.NewRecorder()
req := newRequest("POST", "/api/daemon/register", map[string]any{
"workspace_id": testWorkspaceID,
"daemon_id": "0192a7a1-5e3c-7be9-9a7d-6e0f1cb3deab",
"legacy_daemon_ids": []string{"NeverSeenHost", "NeverSeenHost.local"},
"device_name": "NeverSeenHost",
"runtimes": []map[string]any{
{"name": "fresh-runtime", "type": "claude", "version": "1.0.0", "status": "online"},
},
})
testHandler.DaemonRegister(w, req)
if w.Code != http.StatusOK {
t.Fatalf("DaemonRegister: expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]any
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("decode register response: %v", err)
}
runtimeID := resp["runtimes"].([]any)[0].(map[string]any)["id"].(string)
t.Cleanup(func() {
testPool.Exec(context.Background(), `DELETE FROM agent_runtime WHERE id = $1`, runtimeID)
})
var legacy *string
if err := testPool.QueryRow(ctx, `SELECT legacy_daemon_id FROM agent_runtime WHERE id = $1`, runtimeID).Scan(&legacy); err != nil {
t.Fatalf("read legacy_daemon_id: %v", err)
}
if legacy != nil {
t.Fatalf("expected legacy_daemon_id to stay NULL when no merge occurred, got %q", *legacy)
}
}
// Regression test for #1224: tasks linked only via AutopilotRunID (run_only
// autopilots) must resolve to the autopilot's workspace. Before the fix,
// resolveTaskWorkspaceID fell through and every StartTask call returned 404.
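
The case-drift tests above rely on the lookup returning every row whose daemon_id matches after lowercasing both sides. A minimal in-memory sketch of that matching rule follows; `runtimeRow` and `findLegacyRuntimes` are hypothetical names for illustration only, and `strings.ToLower` is assumed to be a close enough stand-in for SQL `LOWER()` on ASCII hostnames.

```go
package main

import (
	"fmt"
	"strings"
)

// runtimeRow is a hypothetical, trimmed-down stand-in for an agent_runtime row.
type runtimeRow struct {
	ID       string
	DaemonID string
	Provider string
}

// findLegacyRuntimes returns every row for the given provider whose daemon_id
// equals the candidate once both sides are lowercased. It returns all matches,
// not just the first, so case-duplicate rows can each be merged.
func findLegacyRuntimes(rows []runtimeRow, provider, candidate string) []runtimeRow {
	want := strings.ToLower(candidate)
	var matches []runtimeRow
	for _, r := range rows {
		if r.Provider == provider && strings.ToLower(r.DaemonID) == want {
			matches = append(matches, r)
		}
	}
	return matches
}

func main() {
	rows := []runtimeRow{
		{ID: "r1", DaemonID: "DupHost.local", Provider: "claude"},
		{ID: "r2", DaemonID: "duphost.local", Provider: "claude"},
		{ID: "r3", DaemonID: "OtherHost.local", Provider: "claude"},
	}
	// A single lowercased candidate resolves both stored casings.
	for _, r := range findLegacyRuntimes(rows, "claude", "duphost.local") {
		fmt.Println(r.ID, r.DaemonID)
	}
}
```

Returning a slice rather than a single row is what lets the caller consolidate both `DupHost.local` and `duphost.local` in one registration.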

@@ -0,0 +1,2 @@
ALTER TABLE agent_runtime
DROP COLUMN IF EXISTS legacy_daemon_id;

@@ -0,0 +1,6 @@
-- Runtime identity is moving from `os.Hostname()` to a persistent daemon UUID.
-- `legacy_daemon_id` records the first hostname-derived daemon_id that was
-- merged into this row, so the previous identity remains traceable for
-- debugging and audit after the old row is deleted.
ALTER TABLE agent_runtime
ADD COLUMN legacy_daemon_id TEXT;
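
The column is written through `COALESCE(legacy_daemon_id, $2)` in RecordRuntimeLegacyDaemonID, which behaves as first-write-wins. The sketch below mimics that rule in Go with a nil pointer standing in for NULL; `recordLegacyDaemonID` is a hypothetical helper, not a function from this change.

```go
package main

import "fmt"

// recordLegacyDaemonID mimics COALESCE(legacy_daemon_id, candidate):
// an existing non-NULL value is kept, otherwise the candidate is stored.
func recordLegacyDaemonID(existing *string, candidate string) *string {
	if existing != nil {
		return existing
	}
	return &candidate
}

func main() {
	var trace *string // NULL on a freshly registered runtime row
	trace = recordLegacyDaemonID(trace, "jiayuans-macbook-pro.local")
	trace = recordLegacyDaemonID(trace, "Jiayuans-MacBook-Pro")
	fmt.Println(*trace) // prints "jiayuans-macbook-pro.local"
}
```

Under this rule a later merge never overwrites the trace left by the first one.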

@@ -42,19 +42,20 @@ type Agent struct {
}
type AgentRuntime struct {
- ID pgtype.UUID `json:"id"`
- WorkspaceID pgtype.UUID `json:"workspace_id"`
- DaemonID pgtype.Text `json:"daemon_id"`
- Name string `json:"name"`
- RuntimeMode string `json:"runtime_mode"`
- Provider string `json:"provider"`
- Status string `json:"status"`
- DeviceInfo string `json:"device_info"`
- Metadata []byte `json:"metadata"`
- LastSeenAt pgtype.Timestamptz `json:"last_seen_at"`
- CreatedAt pgtype.Timestamptz `json:"created_at"`
- UpdatedAt pgtype.Timestamptz `json:"updated_at"`
- OwnerID pgtype.UUID `json:"owner_id"`
+ ID pgtype.UUID `json:"id"`
+ WorkspaceID pgtype.UUID `json:"workspace_id"`
+ DaemonID pgtype.Text `json:"daemon_id"`
+ Name string `json:"name"`
+ RuntimeMode string `json:"runtime_mode"`
+ Provider string `json:"provider"`
+ Status string `json:"status"`
+ DeviceInfo string `json:"device_info"`
+ Metadata []byte `json:"metadata"`
+ LastSeenAt pgtype.Timestamptz `json:"last_seen_at"`
+ CreatedAt pgtype.Timestamptz `json:"created_at"`
+ UpdatedAt pgtype.Timestamptz `json:"updated_at"`
+ OwnerID pgtype.UUID `json:"owner_id"`
+ LegacyDaemonID pgtype.Text `json:"legacy_daemon_id"`
}
type AgentSkill struct {

@@ -114,8 +114,71 @@ func (q *Queries) FailTasksForOfflineRuntimes(ctx context.Context) ([]FailTasksF
return items, nil
}
const findLegacyRuntimesByDaemonID = `-- name: FindLegacyRuntimesByDaemonID :many
SELECT id, workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, last_seen_at, created_at, updated_at, owner_id, legacy_daemon_id FROM agent_runtime
WHERE workspace_id = $1
AND provider = $2
AND LOWER(daemon_id) = LOWER($3)
`
type FindLegacyRuntimesByDaemonIDParams struct {
WorkspaceID pgtype.UUID `json:"workspace_id"`
Provider string `json:"provider"`
DaemonID string `json:"daemon_id"`
}
// Looks up runtime rows keyed on a prior (hostname-derived) daemon_id. Used
// at register-time to find rows owned by the same machine under its old
// identity so agents/tasks can be re-pointed at the new UUID-keyed row.
//
// Comparison is case-insensitive because os.Hostname() has been observed to
// return different casings on the same machine (e.g. `Jiayuans-MacBook-Pro`
// vs `jiayuans-macbook-pro`) across reboots/mDNS state changes. A case-
// sensitive `=` would strand the old row; LOWER() on both sides handles drift
// without forcing the daemon to enumerate cased permutations.
//
// Returns many rather than one because case drift may have already minted
// duplicate rows historically (e.g. `Foo.local` AND `foo.local` under the
// same workspace+provider). A single-row lookup would consolidate only one
// of them and leave the rest orphaned. Callers must merge every returned
// row into the new UUID-keyed runtime.
func (q *Queries) FindLegacyRuntimesByDaemonID(ctx context.Context, arg FindLegacyRuntimesByDaemonIDParams) ([]AgentRuntime, error) {
rows, err := q.db.Query(ctx, findLegacyRuntimesByDaemonID, arg.WorkspaceID, arg.Provider, arg.DaemonID)
if err != nil {
return nil, err
}
defer rows.Close()
items := []AgentRuntime{}
for rows.Next() {
var i AgentRuntime
if err := rows.Scan(
&i.ID,
&i.WorkspaceID,
&i.DaemonID,
&i.Name,
&i.RuntimeMode,
&i.Provider,
&i.Status,
&i.DeviceInfo,
&i.Metadata,
&i.LastSeenAt,
&i.CreatedAt,
&i.UpdatedAt,
&i.OwnerID,
&i.LegacyDaemonID,
); err != nil {
return nil, err
}
items = append(items, i)
}
if err := rows.Err(); err != nil {
return nil, err
}
return items, nil
}
const getAgentRuntime = `-- name: GetAgentRuntime :one
- SELECT id, workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, last_seen_at, created_at, updated_at, owner_id FROM agent_runtime
+ SELECT id, workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, last_seen_at, created_at, updated_at, owner_id, legacy_daemon_id FROM agent_runtime
WHERE id = $1
`
@@ -136,12 +199,13 @@ func (q *Queries) GetAgentRuntime(ctx context.Context, id pgtype.UUID) (AgentRun
&i.CreatedAt,
&i.UpdatedAt,
&i.OwnerID,
&i.LegacyDaemonID,
)
return i, err
}
const getAgentRuntimeForWorkspace = `-- name: GetAgentRuntimeForWorkspace :one
- SELECT id, workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, last_seen_at, created_at, updated_at, owner_id FROM agent_runtime
+ SELECT id, workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, last_seen_at, created_at, updated_at, owner_id, legacy_daemon_id FROM agent_runtime
WHERE id = $1 AND workspace_id = $2
`
@@ -167,12 +231,13 @@ func (q *Queries) GetAgentRuntimeForWorkspace(ctx context.Context, arg GetAgentR
&i.CreatedAt,
&i.UpdatedAt,
&i.OwnerID,
&i.LegacyDaemonID,
)
return i, err
}
const listAgentRuntimes = `-- name: ListAgentRuntimes :many
- SELECT id, workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, last_seen_at, created_at, updated_at, owner_id FROM agent_runtime
+ SELECT id, workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, last_seen_at, created_at, updated_at, owner_id, legacy_daemon_id FROM agent_runtime
WHERE workspace_id = $1
ORDER BY created_at ASC
`
@@ -200,6 +265,7 @@ func (q *Queries) ListAgentRuntimes(ctx context.Context, workspaceID pgtype.UUID
&i.CreatedAt,
&i.UpdatedAt,
&i.OwnerID,
&i.LegacyDaemonID,
); err != nil {
return nil, err
}
@@ -212,7 +278,7 @@ func (q *Queries) ListAgentRuntimes(ctx context.Context, workspaceID pgtype.UUID
}
const listAgentRuntimesByOwner = `-- name: ListAgentRuntimesByOwner :many
- SELECT id, workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, last_seen_at, created_at, updated_at, owner_id FROM agent_runtime
+ SELECT id, workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, last_seen_at, created_at, updated_at, owner_id, legacy_daemon_id FROM agent_runtime
WHERE workspace_id = $1 AND owner_id = $2
ORDER BY created_at ASC
`
@@ -245,6 +311,7 @@ func (q *Queries) ListAgentRuntimesByOwner(ctx context.Context, arg ListAgentRun
&i.CreatedAt,
&i.UpdatedAt,
&i.OwnerID,
&i.LegacyDaemonID,
); err != nil {
return nil, err
}
@@ -289,48 +356,68 @@ func (q *Queries) MarkStaleRuntimesOffline(ctx context.Context, staleSeconds flo
return items, nil
}
- const migrateAgentsToRuntime = `-- name: MigrateAgentsToRuntime :execrows
+ const reassignAgentsToRuntime = `-- name: ReassignAgentsToRuntime :execrows
UPDATE agent
SET runtime_id = $1
- WHERE runtime_id IN (
- SELECT ar.id FROM agent_runtime ar
- WHERE ar.workspace_id = $2
- AND ar.provider = $3
- AND ar.owner_id = $4
- AND ar.id != $1
- AND ar.status = 'offline'
- AND ar.daemon_id LIKE $5 || '-%'
- )
+ WHERE runtime_id = $2
`
- type MigrateAgentsToRuntimeParams struct {
- NewRuntimeID pgtype.UUID `json:"new_runtime_id"`
- WorkspaceID pgtype.UUID `json:"workspace_id"`
- Provider string `json:"provider"`
- OwnerID pgtype.UUID `json:"owner_id"`
- DaemonIDPrefix pgtype.Text `json:"daemon_id_prefix"`
+ type ReassignAgentsToRuntimeParams struct {
+ NewRuntimeID pgtype.UUID `json:"new_runtime_id"`
+ OldRuntimeID pgtype.UUID `json:"old_runtime_id"`
}
- // Migrates agents from stale offline runtimes to the newly registered runtime.
- // Only migrates from runtimes that match the same workspace, provider, owner,
- // AND whose daemon_id starts with the current daemon_id followed by '-'.
- // This scopes migration to old profile-suffixed runtimes from the same machine
- // (e.g. "MacBook-staging" matches daemon_id_prefix "MacBook") without touching
- // runtimes from other machines belonging to the same user.
- func (q *Queries) MigrateAgentsToRuntime(ctx context.Context, arg MigrateAgentsToRuntimeParams) (int64, error) {
- result, err := q.db.Exec(ctx, migrateAgentsToRuntime,
- arg.NewRuntimeID,
- arg.WorkspaceID,
- arg.Provider,
- arg.OwnerID,
- arg.DaemonIDPrefix,
- )
+ // Re-points every agent referencing old_runtime_id at new_runtime_id.
+ func (q *Queries) ReassignAgentsToRuntime(ctx context.Context, arg ReassignAgentsToRuntimeParams) (int64, error) {
+ result, err := q.db.Exec(ctx, reassignAgentsToRuntime, arg.NewRuntimeID, arg.OldRuntimeID)
if err != nil {
return 0, err
}
return result.RowsAffected(), nil
}
const reassignTasksToRuntime = `-- name: ReassignTasksToRuntime :execrows
UPDATE agent_task_queue
SET runtime_id = $1
WHERE runtime_id = $2
`
type ReassignTasksToRuntimeParams struct {
NewRuntimeID pgtype.UUID `json:"new_runtime_id"`
OldRuntimeID pgtype.UUID `json:"old_runtime_id"`
}
// Re-points every queued/running/completed task referencing old_runtime_id.
// Required before deleting the old runtime row because agent_task_queue has
// an ON DELETE CASCADE FK that would otherwise drop historical tasks.
func (q *Queries) ReassignTasksToRuntime(ctx context.Context, arg ReassignTasksToRuntimeParams) (int64, error) {
result, err := q.db.Exec(ctx, reassignTasksToRuntime, arg.NewRuntimeID, arg.OldRuntimeID)
if err != nil {
return 0, err
}
return result.RowsAffected(), nil
}
const recordRuntimeLegacyDaemonID = `-- name: RecordRuntimeLegacyDaemonID :exec
UPDATE agent_runtime
SET legacy_daemon_id = COALESCE(legacy_daemon_id, $2)
WHERE id = $1
`
type RecordRuntimeLegacyDaemonIDParams struct {
ID pgtype.UUID `json:"id"`
LegacyDaemonID pgtype.Text `json:"legacy_daemon_id"`
}
// Remembers the first hostname-derived daemon_id that was merged into this
// row, which helps when tracing why a given runtime row subsumed an old one.
// COALESCE only overwrites NULL, so the earliest merge is preserved.
func (q *Queries) RecordRuntimeLegacyDaemonID(ctx context.Context, arg RecordRuntimeLegacyDaemonIDParams) error {
_, err := q.db.Exec(ctx, recordRuntimeLegacyDaemonID, arg.ID, arg.LegacyDaemonID)
return err
}
const setAgentRuntimeOffline = `-- name: SetAgentRuntimeOffline :exec
UPDATE agent_runtime
SET status = 'offline', updated_at = now()
@@ -346,7 +433,7 @@ const updateAgentRuntimeHeartbeat = `-- name: UpdateAgentRuntimeHeartbeat :one
UPDATE agent_runtime
SET status = 'online', last_seen_at = now(), updated_at = now()
WHERE id = $1
- RETURNING id, workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, last_seen_at, created_at, updated_at, owner_id
+ RETURNING id, workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, last_seen_at, created_at, updated_at, owner_id, legacy_daemon_id
`
func (q *Queries) UpdateAgentRuntimeHeartbeat(ctx context.Context, id pgtype.UUID) (AgentRuntime, error) {
@@ -366,6 +453,7 @@ func (q *Queries) UpdateAgentRuntimeHeartbeat(ctx context.Context, id pgtype.UUI
&i.CreatedAt,
&i.UpdatedAt,
&i.OwnerID,
&i.LegacyDaemonID,
)
return i, err
}
@@ -393,7 +481,7 @@ DO UPDATE SET
owner_id = COALESCE(EXCLUDED.owner_id, agent_runtime.owner_id),
last_seen_at = now(),
updated_at = now()
- RETURNING id, workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, last_seen_at, created_at, updated_at, owner_id
+ RETURNING id, workspace_id, daemon_id, name, runtime_mode, provider, status, device_info, metadata, last_seen_at, created_at, updated_at, owner_id, legacy_daemon_id
`
type UpsertAgentRuntimeParams struct {
@@ -435,6 +523,7 @@ func (q *Queries) UpsertAgentRuntime(ctx context.Context, arg UpsertAgentRuntime
&i.CreatedAt,
&i.UpdatedAt,
&i.OwnerID,
&i.LegacyDaemonID,
)
return i, err
}
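
The generated queries above imply an ordering for the merge: re-point agents, re-point tasks, and only then delete the legacy runtime row, because deleting first would let the `ON DELETE CASCADE` FK on agent_task_queue drop historical tasks. The following is a rough in-memory sketch of that ordering, not the handler's actual implementation; `store`, `deleteRuntime`, and `mergeLegacy` are hypothetical names, and transaction handling is left out.

```go
package main

import "fmt"

// store is a hypothetical in-memory stand-in for the tables involved.
type store struct {
	runtimes map[string]bool   // runtime id -> exists
	agentRT  map[string]string // agent id -> runtime id
	taskRT   map[string]string // task id -> runtime id
}

// deleteRuntime removes a runtime and, like the ON DELETE CASCADE FK on
// agent_task_queue, drops every task still pointing at it.
func (s *store) deleteRuntime(id string) {
	delete(s.runtimes, id)
	for task, rt := range s.taskRT {
		if rt == id {
			delete(s.taskRT, task)
		}
	}
}

// mergeLegacy re-points agents and tasks from each legacy runtime to newID
// before deleting the legacy row, so the simulated cascade finds nothing to drop.
func (s *store) mergeLegacy(newID string, legacyIDs []string) {
	for _, oldID := range legacyIDs {
		if oldID == newID {
			continue // never merge a row into itself
		}
		for agent, rt := range s.agentRT {
			if rt == oldID {
				s.agentRT[agent] = newID
			}
		}
		for task, rt := range s.taskRT {
			if rt == oldID {
				s.taskRT[task] = newID
			}
		}
		s.deleteRuntime(oldID)
	}
}

func main() {
	s := &store{
		runtimes: map[string]bool{"new": true, "upper": true, "lower": true},
		agentRT:  map[string]string{"a1": "upper", "a2": "lower"},
		taskRT:   map[string]string{"t1": "upper", "t2": "lower"},
	}
	s.mergeLegacy("new", []string{"upper", "lower"})
	fmt.Println(len(s.runtimes), len(s.taskRT), s.agentRT["a1"]) // prints "1 2 new"
}
```

Calling `deleteRuntime` before the two re-pointing loops would leave `taskRT` empty, which is the data loss the ReassignTasksToRuntime comment warns about.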

@@ -79,24 +79,49 @@ SELECT count(*) FROM agent WHERE runtime_id = $1 AND archived_at IS NULL;
-- name: DeleteArchivedAgentsByRuntime :exec
DELETE FROM agent WHERE runtime_id = $1 AND archived_at IS NOT NULL;
- -- name: MigrateAgentsToRuntime :execrows
- -- Migrates agents from stale offline runtimes to the newly registered runtime.
- -- Only migrates from runtimes that match the same workspace, provider, owner,
- -- AND whose daemon_id starts with the current daemon_id followed by '-'.
- -- This scopes migration to old profile-suffixed runtimes from the same machine
- -- (e.g. "MacBook-staging" matches daemon_id_prefix "MacBook") without touching
- -- runtimes from other machines belonging to the same user.
-- name: FindLegacyRuntimesByDaemonID :many
-- Looks up runtime rows keyed on a prior (hostname-derived) daemon_id. Used
-- at register-time to find rows owned by the same machine under its old
-- identity so agents/tasks can be re-pointed at the new UUID-keyed row.
--
-- Comparison is case-insensitive because os.Hostname() has been observed to
-- return different casings on the same machine (e.g. `Jiayuans-MacBook-Pro`
-- vs `jiayuans-macbook-pro`) across reboots/mDNS state changes. A case-
-- sensitive `=` would strand the old row; LOWER() on both sides handles drift
-- without forcing the daemon to enumerate cased permutations.
--
-- Returns many rather than one because case drift may have already minted
-- duplicate rows historically (e.g. `Foo.local` AND `foo.local` under the
-- same workspace+provider). A single-row lookup would consolidate only one
-- of them and leave the rest orphaned. Callers must merge every returned
-- row into the new UUID-keyed runtime.
SELECT * FROM agent_runtime
WHERE workspace_id = @workspace_id
AND provider = @provider
AND LOWER(daemon_id) = LOWER(@daemon_id);
-- name: ReassignAgentsToRuntime :execrows
-- Re-points every agent referencing old_runtime_id at new_runtime_id.
UPDATE agent
SET runtime_id = @new_runtime_id
- WHERE runtime_id IN (
- SELECT ar.id FROM agent_runtime ar
- WHERE ar.workspace_id = @workspace_id
- AND ar.provider = @provider
- AND ar.owner_id = @owner_id
- AND ar.id != @new_runtime_id
- AND ar.status = 'offline'
- AND ar.daemon_id LIKE @daemon_id_prefix || '-%'
- );
+ WHERE runtime_id = @old_runtime_id;
-- name: ReassignTasksToRuntime :execrows
-- Re-points every queued/running/completed task referencing old_runtime_id.
-- Required before deleting the old runtime row because agent_task_queue has
-- an ON DELETE CASCADE FK that would otherwise drop historical tasks.
UPDATE agent_task_queue
SET runtime_id = @new_runtime_id
WHERE runtime_id = @old_runtime_id;
-- name: RecordRuntimeLegacyDaemonID :exec
-- Remembers the first hostname-derived daemon_id that was merged into this
-- row, which helps when tracing why a given runtime row subsumed an old one.
-- COALESCE only overwrites NULL, so the earliest merge is preserved.
UPDATE agent_runtime
SET legacy_daemon_id = COALESCE(legacy_daemon_id, $2)
WHERE id = $1;
-- name: DeleteStaleOfflineRuntimes :many
-- Deletes runtimes that have been offline for longer than the TTL and have