agent/docs/gateway.md
highperfocused 089bd7bd48
Add HTTP gateway with streaming chat and multi-client conversation mapping (#1)
Reviewed-on: MoA/agent#1
Co-authored-by: highperfocused <highperfocused@pm.me>
Co-committed-by: highperfocused <highperfocused@pm.me>
2026-03-12 14:47:45 +01:00

Gateway: how it works

This document explains how the HTTP gateway in this repository works.

Overview

The gateway is a thin HTTP layer around @mariozechner/pi-coding-agent sessions.

Main goals:

  • expose chat over HTTP (/v1/chat, /v1/chat/stream)
  • keep long-lived conversation state per conversationId
  • support adapter-friendly IDs (Slack/Matrix/etc.)
  • optionally expose a built-in browser UI at /

Key source files:

  • src/index.ts
  • src/gateway/server.ts
  • src/conversation-manager.ts
  • src/agent-session-factory.ts
  • src/gateway/events.ts

Startup flow

  1. src/index.ts loads env config via loadConfig().
  2. If RUN_MODE=single, one-shot mode runs and the process exits.
  3. Otherwise (RUN_MODE=gateway), it:
    • creates ConversationManager
    • initializes persisted conversation metadata (if enabled)
    • starts GatewayHttpServer
  4. On SIGINT/SIGTERM, it stops the HTTP server and disposes sessions.
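
The startup flow above can be sketched roughly as follows. This is a minimal illustration, not the code in src/index.ts: the names boot and GatewayParts and all of the callback names are hypothetical stand-ins for the real components.

```typescript
// Hypothetical stand-ins for the real components (ConversationManager, GatewayHttpServer).
interface GatewayParts {
  runSingleShot(): Promise<void>;
  initConversations(): Promise<void>; // load persisted metadata, if enabled
  startServer(): Promise<void>;
  stopServer(): Promise<void>;
  disposeSessions(): Promise<void>;
}

// Returns a shutdown handler in gateway mode, or undefined after a one-shot run.
async function boot(
  runMode: string,
  parts: GatewayParts,
): Promise<(() => Promise<void>) | undefined> {
  if (runMode === "single") {
    await parts.runSingleShot();
    return undefined; // the real process would exit here
  }
  await parts.initConversations();
  await parts.startServer();
  // This is what a SIGINT/SIGTERM listener would call.
  return async () => {
    await parts.stopServer();
    await parts.disposeSessions();
  };
}
```

In a Node.js process, the returned handler would typically be registered for both SIGINT and SIGTERM.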

Core components

1) GatewayHttpServer (src/gateway/server.ts)

Responsible for:

  • request routing
  • auth and CORS handling
  • request validation
  • SSE streaming responses
  • JSON/HTML responses

2) ConversationManager (src/conversation-manager.ts)

Responsible for:

  • creating and tracking conversation records
  • loading/opening/creating agent sessions
  • serializing prompts per conversation (queue)
  • persisting conversation index + session metadata
  • aborting/deleting sessions

3) AgentSessionFactory (src/agent-session-factory.ts)

Responsible for constructing agent sessions with:

  • model/provider selection (including Ollama support)
  • tool selection (all, readonly, none, or subset)
  • optional system prompt override/append
  • auth storage and model registry wiring

Conversation model

A conversation is identified by conversationId.

  • If the client provides no ID, a UUID is generated.
  • Each conversation maps to one AgentSession.
  • Multiple requests for the same conversation are queued and processed in order.
  • Metadata is exposed via /v1/conversations endpoints.
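
One common way to get the in-order processing described above is to chain each conversation's work onto a per-ID promise. The sketch below shows that technique only; ConversationQueue is a hypothetical name, and the actual queue in src/conversation-manager.ts may be built differently.

```typescript
// Hypothetical helper: runs tasks for the same conversationId strictly in order.
class ConversationQueue {
  // Last scheduled task per conversation, stored in a form that never rejects.
  private tails = new Map<string, Promise<unknown>>();

  enqueue<T>(conversationId: string, task: () => Promise<T>): Promise<T> {
    const previous: Promise<unknown> =
      this.tails.get(conversationId) ?? Promise.resolve();
    // Start this task only after the previous one has settled.
    const result = previous.then(() => task());
    // Swallow failures on the stored tail so one failed prompt
    // does not block later prompts for the same conversation.
    this.tails.set(conversationId, result.catch(() => undefined));
    return result;
  }
}
```

Tasks for different conversation IDs use separate chains, so they do not wait on each other. A long-running server would also want to drop map entries for deleted conversations.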

Validation rules:

  • conversationId max length: 200
  • conversationId must not contain newline characters (\n or \r)
  • message must be a non-empty string
  • images must be an array when provided
  • streamingBehavior must be "steer" or "followUp" when provided
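
The rules above could be checked with a function along these lines. This is a sketch under assumptions: validateChatBody and the error strings are hypothetical, the string type check on conversationId is assumed, and "non-empty" is read as length greater than zero (the real validator may also reject whitespace-only messages).

```typescript
type ValidationResult = { ok: true } | { ok: false; error: string };

function fail(error: string): ValidationResult {
  return { ok: false, error };
}

// Hypothetical validator mirroring the documented rules.
function validateChatBody(body: Record<string, unknown>): ValidationResult {
  const conversationId = body["conversationId"];
  const message = body["message"];
  const images = body["images"];
  const streamingBehavior = body["streamingBehavior"];

  if (conversationId !== undefined) {
    // Assumption: a provided conversationId must be a string.
    if (typeof conversationId !== "string") return fail("conversationId must be a string");
    if (conversationId.length > 200) return fail("conversationId exceeds 200 characters");
    if (/[\r\n]/.test(conversationId)) return fail("conversationId must not contain \\n or \\r");
  }
  if (typeof message !== "string" || message.length === 0) {
    return fail("message must be a non-empty string");
  }
  if (images !== undefined && !Array.isArray(images)) {
    return fail("images must be an array");
  }
  if (
    streamingBehavior !== undefined &&
    streamingBehavior !== "steer" &&
    streamingBehavior !== "followUp"
  ) {
    return fail('streamingBehavior must be "steer" or "followUp"');
  }
  return { ok: true };
}
```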

Persistence behavior

Controlled by SESSION_PERSIST.

SESSION_PERSIST=true

Data is stored under:

  • <CWD>/.gateway/conversations.json (conversation index)
  • <CWD>/.gateway/sessions/... (session files)

At startup, the index is loaded and conversations are restored as unloaded records. The actual AgentSession is lazily opened when that conversation is used.

SESSION_PERSIST=false

Everything is in memory and lost on process exit.
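
The "restored as unloaded, opened lazily" behavior can be pictured with the sketch below. The class and its members are hypothetical; it only shows the shape of the idea, with the session opener passed in as a callback instead of reading real session files.

```typescript
interface ConversationRecord<S> {
  conversationId: string;
  session: S | undefined; // undefined until the conversation is first used
}

// Hypothetical registry: restore() creates unloaded records, use() opens on demand.
class LazyConversations<S> {
  private records = new Map<string, ConversationRecord<S>>();
  private openSession: (conversationId: string) => Promise<S>;

  constructor(openSession: (conversationId: string) => Promise<S>) {
    this.openSession = openSession;
  }

  // Called at startup with the IDs found in the persisted index.
  restore(conversationIds: string[]): void {
    for (const id of conversationIds) {
      this.records.set(id, { conversationId: id, session: undefined });
    }
  }

  isLoaded(conversationId: string): boolean {
    return this.records.get(conversationId)?.session !== undefined;
  }

  async use(conversationId: string): Promise<S> {
    const record = this.records.get(conversationId);
    if (!record) throw new Error(`unknown conversation: ${conversationId}`);
    if (record.session === undefined) {
      record.session = await this.openSession(conversationId);
    }
    return record.session;
  }
}
```

The sketch does not guard against two concurrent use() calls for the same ID; it relies on requests for one conversation being serialized.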


API routes

Health/UI

  • GET /health → { "ok": true }
  • GET / → built-in Web UI HTML (if GATEWAY_ENABLE_WEB_UI=true)

Conversation management

  • GET /v1/conversations
  • POST /v1/conversations
  • GET /v1/conversations/:id
  • DELETE /v1/conversations/:id
  • POST /v1/conversations/:id/abort

Chat

  • POST /v1/chat (JSON response)
  • POST /v1/chat/stream (SSE response)
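
As an illustration, a client might assemble a POST /v1/chat request like this. The body fields are the ones named in the validation rules; buildChatRequest itself is a hypothetical helper, and the shape of the JSON response is not assumed here.

```typescript
interface ChatRequestBody {
  message: string;
  conversationId?: string;
  images?: unknown[];
  streamingBehavior?: "steer" | "followUp";
}

// Hypothetical helper: builds the URL and fetch-style options for POST /v1/chat.
function buildChatRequest(baseUrl: string, body: ChatRequestBody, token?: string) {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Only needed when the gateway is started with GATEWAY_AUTH_TOKEN.
  if (token) headers["Authorization"] = `Bearer ${token}`;
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat`,
    init: { method: "POST", headers, body: JSON.stringify(body) },
  };
}
```

The result can be passed to fetch as fetch(request.url, request.init). For the streaming variant, the path would be /v1/chat/stream and the response read as an event stream.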

Adapter endpoints

  • POST /v1/adapters/chat
  • POST /v1/adapters/chat/stream

Adapter request fields (source, workspaceId, channelId, threadId, userId) are normalized into:

  • conversationId = source:workspaceId:channelId:threadId
  • adapterKey = source:workspaceId:channelId:threadId:userId

channelId is required. The colon character (:) is not allowed inside segment values.
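
The normalization can be sketched as follows. normalizeAdapterIds is a hypothetical name, and how the gateway fills in missing optional segments is not stated here; this sketch assumes they become empty strings.

```typescript
interface AdapterIds {
  source?: string;
  workspaceId?: string;
  channelId: string;
  threadId?: string;
  userId?: string;
}

// Hypothetical helper: derives conversationId and adapterKey from adapter fields.
function normalizeAdapterIds(ids: AdapterIds): { conversationId: string; adapterKey: string } {
  if (!ids.channelId) throw new Error("channelId is required");
  // Assumption: absent optional segments are represented as empty strings.
  const segments = [
    ids.source ?? "",
    ids.workspaceId ?? "",
    ids.channelId,
    ids.threadId ?? "",
    ids.userId ?? "",
  ];
  for (const segment of segments) {
    if (segment.includes(":")) throw new Error("':' is not allowed inside segment values");
  }
  return {
    conversationId: segments.slice(0, 4).join(":"), // source:workspaceId:channelId:threadId
    adapterKey: segments.join(":"),                 // ...plus :userId
  };
}
```

For example, source "slack", workspaceId "T1", channelId "C1", threadId "171.5", and userId "U1" yield conversationId "slack:T1:C1:171.5" and adapterKey "slack:T1:C1:171.5:U1". All users in the same thread therefore share one conversation, while adapterKey still distinguishes who sent the message.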

For practical setup patterns per transport (Web UI, Slack, Matrix, custom), see docs/channels.md.


Streaming (SSE) behavior

For /v1/chat/stream and /v1/adapters/chat/stream:

  1. Response starts with SSE headers.
  2. A ready event is emitted.
  3. Agent session events are mapped to gateway events (src/gateway/events.ts).
  4. A final done event is emitted with a summary payload.
  5. On failure, an error event is emitted and the stream ends.
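
On the wire, each event in the sequence above is a small text frame. The sketch below writes one frame in the standard Server-Sent Events format; it assumes the gateway uses a named event field with a JSON data field, which is a common convention but is not confirmed by this document. formatSseEvent is a hypothetical helper.

```typescript
// Hypothetical helper: serializes one gateway event as an SSE frame.
// Assumes "event:" carries the event type and "data:" carries a JSON payload.
function formatSseEvent(event: string, data: unknown): string {
  // JSON.stringify never emits raw newlines, so the payload fits on one data line.
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
```

The blank line at the end of the frame is what tells the client that the event is complete.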

Common emitted event types:

  • assistant_text_delta
  • assistant_thinking_delta
  • assistant_message_update
  • tool_start, tool_update, tool_end
  • agent_start, agent_end
  • retry_start, retry_end
  • compaction_start, compaction_end
  • done
  • error
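
A client that cannot use a ready-made SSE library can split the stream into events itself. The parser below handles the subset of the SSE format needed here (event and data fields, comment lines, blank-line dispatch); parseSse is a hypothetical helper, and it assumes the event types listed above arrive in the event field.

```typescript
interface SseEvent {
  event: string; // for example "assistant_text_delta" or "done"
  data: string;  // raw payload text; decode with JSON.parse if it is JSON
}

// Hypothetical helper: parses a complete chunk of SSE text into events.
function parseSse(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  let event = "message"; // SSE default when no event field is present
  let data: string[] = [];
  for (const line of text.split(/\r\n|\n|\r/)) {
    if (line === "") {
      // A blank line ends the current event.
      if (data.length > 0) events.push({ event, data: data.join("\n") });
      event = "message";
      data = [];
      continue;
    }
    if (line.startsWith(":")) continue; // comment or keep-alive line
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1);
    if (field === "event") event = value;
    else if (field === "data") data.push(value);
  }
  return events;
}
```

When reading from a network stream, frames can be split across chunks, so a real client would buffer text until it sees a blank line before parsing.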

done includes:

  • conversationId
  • sessionId
  • sessionFile
  • assistantText
  • plus adapterKey on adapter streaming routes

Disconnect behavior:

  • If the client disconnects mid-stream and the request had a conversationId, the server attempts to abort that conversation.

Auth and CORS

Bearer auth

If GATEWAY_AUTH_TOKEN is set, requests must include:

Authorization: Bearer <token>

Otherwise, the server returns 401.

Note: auth is checked before route handling, so this applies to all routes (including GET / and GET /health).
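
The check itself can be as small as the sketch below. isAuthorized is a hypothetical name, and the exact-match comparison (including the capitalized "Bearer" scheme) is an assumption about how strictly the server parses the header.

```typescript
// Hypothetical helper: decides whether a request passes the bearer check.
function isAuthorized(
  authorizationHeader: string | undefined,
  expectedToken: string | undefined,
): boolean {
  // No GATEWAY_AUTH_TOKEN configured: auth is disabled and every request passes.
  if (!expectedToken) return true;
  // Assumption: the header must match exactly, scheme and token.
  return authorizationHeader === `Bearer ${expectedToken}`;
}
```

Because the check runs before routing, a caller would respond with 401 whenever this returns false, regardless of the path.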

CORS

If GATEWAY_CORS_ORIGIN is set, the server adds:

  • Access-Control-Allow-Origin
  • Access-Control-Allow-Headers: Content-Type, Authorization
  • Access-Control-Allow-Methods: GET, POST, DELETE, OPTIONS

OPTIONS preflight returns 204.
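
The header set can be expressed as a small function. corsHeaders is a hypothetical name, and the sketch assumes the Access-Control-Allow-Origin value is the configured GATEWAY_CORS_ORIGIN string passed through unchanged.

```typescript
// Hypothetical helper: returns the CORS headers to add to a response.
function corsHeaders(corsOrigin: string | undefined): Record<string, string> {
  // GATEWAY_CORS_ORIGIN unset: no CORS headers are added.
  if (!corsOrigin) return {};
  return {
    // Assumption: the configured origin is echoed as-is.
    "Access-Control-Allow-Origin": corsOrigin,
    "Access-Control-Allow-Headers": "Content-Type, Authorization",
    "Access-Control-Allow-Methods": "GET, POST, DELETE, OPTIONS",
  };
}
```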


Request limits and errors

  • JSON body max size: 1 MiB (413 if exceeded)
  • invalid JSON: 400
  • invalid payload field types: 400
  • unknown route: 404
  • unexpected errors: 500
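
The first two rules can be sketched as a single parsing step that returns either a value or an HTTP status. parseJsonBody and the error strings are hypothetical; the byte count is passed in separately because, in a Node.js server, it would normally come from summing the sizes of the received chunks.

```typescript
const MAX_BODY_BYTES = 1024 * 1024; // 1 MiB

type BodyResult =
  | { ok: true; value: unknown }
  | { ok: false; status: 400 | 413; error: string };

// Hypothetical helper: applies the size limit, then parses JSON.
function parseJsonBody(raw: string, byteLength: number): BodyResult {
  if (byteLength > MAX_BODY_BYTES) {
    return { ok: false, status: 413, error: "request body too large" };
  }
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch {
    return { ok: false, status: 400, error: "invalid JSON" };
  }
}
```

A streaming server would usually stop reading as soon as the running byte count passes the limit instead of buffering the whole body first.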

Environment variables (gateway-relevant)

  • RUN_MODE (gateway | single)
  • GATEWAY_HOST
  • GATEWAY_PORT
  • GATEWAY_CORS_ORIGIN
  • GATEWAY_AUTH_TOKEN
  • GATEWAY_ENABLE_WEB_UI
  • SESSION_PERSIST
  • VERBOSE_TOOLS
  • CWD

See .env.example for complete defaults and comments.
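
For orientation, the variables above could be read into a typed object as in the sketch below. This is not the repository's loadConfig(): readGatewayConfig and GatewayConfig are hypothetical, no defaults from .env.example are reproduced, unset values stay undefined, and boolean flags are assumed to be enabled only by the literal string "true".

```typescript
type Env = Record<string, string | undefined>;

interface GatewayConfig {
  runMode: "gateway" | "single";
  host: string | undefined;
  port: number | undefined;
  corsOrigin: string | undefined;
  authToken: string | undefined;
  enableWebUi: boolean;
  sessionPersist: boolean;
  verboseTools: boolean;
  cwd: string | undefined;
}

// Hypothetical reader; the real defaults live in .env.example.
function readGatewayConfig(env: Env): GatewayConfig {
  // Assumption: a flag is on only when its value is exactly "true".
  const flag = (name: string): boolean => env[name] === "true";
  const portRaw = env["GATEWAY_PORT"];
  const port = portRaw === undefined ? undefined : Number(portRaw);
  if (port !== undefined && !Number.isInteger(port)) {
    throw new Error(`invalid GATEWAY_PORT: ${portRaw}`);
  }
  return {
    // Anything other than "single" is treated as gateway mode.
    runMode: env["RUN_MODE"] === "single" ? "single" : "gateway",
    host: env["GATEWAY_HOST"],
    port,
    corsOrigin: env["GATEWAY_CORS_ORIGIN"],
    authToken: env["GATEWAY_AUTH_TOKEN"],
    enableWebUi: flag("GATEWAY_ENABLE_WEB_UI"),
    sessionPersist: flag("SESSION_PERSIST"),
    verboseTools: flag("VERBOSE_TOOLS"),
    cwd: env["CWD"],
  };
}
```

In a Node.js process this would be called with process.env.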