Stream multi-turn chat with persistent history and built-in tool use over your records. The scaffold registers four HTTP endpoints, persists every chat to a Durable Object, and streams responses via Vercel AI SDK v5. Use the bundled `ChatPanel` component for a turnkey UI, or call the streaming endpoint directly and decode it with the wire helpers.
Install the chat feature
Install the bundled feature with `npx deepspace add ai-chat`. The command adds:
- `src/components/ChatPanel.tsx` - a reusable chat surface with model picker, abort button, and Markdown rendering.
- `src/pages/assistant.tsx` - a full-page assistant with a chat history rail.
- `src/schemas/ai-chat-schema.ts` - exports `aiChatSchemas` (an array of `[AI_CHATS_SCHEMA, AI_MESSAGES_SCHEMA]`) for spreading into `src/schemas.ts`.

It also adds react-markdown, remark-gfm, remark-breaks, rehype-highlight, and highlight.js as dependencies and runs npm install.
Add the schemas
If you installed the feature with `npx deepspace add ai-chat`, spread the array it installed into `src/schemas.ts`. If you’re wiring chat by hand, import the two pre-built schemas directly.
| Schema | Rows | RBAC |
|---|---|---|
| `AI_CHATS_SCHEMA` (`ai-chats`) | One per chat conversation | read/update/delete: 'own', create: false |
| `AI_MESSAGES_SCHEMA` (`ai-messages`) | One per message (user or assistant) | read/update/delete: 'own', create: false |
The four chat endpoints
The scaffold registers four endpoints in `src/ai/chat-routes.ts`. The first three manage chat records; the fourth streams a turn.
- Create chat
- Rename chat
- Delete chat
- Stream a turn
The `title` field is optional.
Streaming pipeline
The `POST /api/ai/chat` handler runs through these steps:
- Load history without persisting the new message. The new user turn is appended in memory only; both user and assistant rows persist together inside `onFinish`, so a stream error leaves zero orphan rows.
- Prepare messages with compaction. Truncate old tool results, apply a cached summary if one exists, and summarize the older half of the conversation if still over the context budget.
- Stream the model. Call `streamText` with the prepared messages, tools, and an abort signal tied to the request.

Switch the model
src/ai/chat-routes.ts maps allowed model IDs to providers:
Unknown `modelId` values are rejected with 400 - there is no silent fallback.
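The allowlist rule can be sketched as follows. This is a minimal illustration, not the scaffold's code: the model IDs, the provider names, and the `resolveModel` helper are hypothetical placeholders. Only the rule itself (an unknown `modelId` yields a 400, with no fallback model) comes from the text above.

```typescript
// Hypothetical allowlist: these IDs and provider names are placeholders,
// not the values shipped in src/ai/chat-routes.ts.
type Provider = "provider-a" | "provider-b";

const ALLOWED_MODELS = new Map<string, Provider>([
  ["example-model-small", "provider-a"],
  ["example-model-large", "provider-b"],
]);

type ModelResolution =
  | { ok: true; modelId: string; provider: Provider }
  | { ok: false; status: number; error: string };

// Resolve a requested model ID against the allowlist.
// Unknown IDs are rejected outright; no default model is substituted.
function resolveModel(modelId: string): ModelResolution {
  const provider = ALLOWED_MODELS.get(modelId);
  if (provider === undefined) {
    return { ok: false, status: 400, error: `Unknown modelId: ${modelId}` };
  }
  return { ok: true, modelId, provider };
}
```

Returning a tagged result keeps the rejection explicit at the call site: a handler would turn the `ok: false` branch into the HTTP 400 response before any model call is made.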
Provider routing happens via createDeepSpaceAI:
| `authToken` | Who pays |
|---|---|
| Passed | The caller (signed-in user) - billed against their DeepSpace credits |
| Omitted | The app owner - billed via APP_OWNER_JWT |
Omit `authToken` for autonomous server-side calls (cron, server actions).
Tool use
The assistant can read and modify your records via a built-in tool catalog. The scaffold ships all of them in `src/ai/tools.ts`:
| Tool | Purpose |
|---|---|
| `schema.list` | Enumerate collection names |
| `schema.describe` | Describe one collection’s columns and permissions |
| `records.query` | Filter and list records |
| `records.get` | Fetch one record |
| `records.create` | Create a record |
| `records.update` | Patch a record |
| `records.delete` | Delete a record |
| `user.current` | Look up the caller’s user record |
Per-collection RBAC at the DO is the security boundary. The user’s own role determines what each tool call can do - the assistant cannot escalate. To run a stricter assistant, trim `ALLOWED_TOOL_NAMES` to reads only.
Adding custom tools
Extend the `ToolSet` returned by `buildTools` in `src/ai/tools.ts`:
inputSchema doubles as runtime validation; failing input emits a tool-input-error SSE chunk the client surfaces.
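The validate-then-execute behaviour can be sketched without dependencies. This is an assumption-laden illustration rather than the scaffold's code: `SketchTool`, `parseInput`, `runTool`, the `text.wordCount` tool, and the `ok` outcome are hypothetical stand-ins. In the scaffold the schema is a Zod `inputSchema` and the tool lives in the `ToolSet`; only the `tool-input-error` chunk name comes from the text above.

```typescript
// Stand-in for one ToolSet entry. `parseInput` plays the role of the Zod
// inputSchema: it returns typed input or throws on invalid input.
interface SketchTool<I> {
  description: string;
  parseInput: (raw: unknown) => I;
  execute: (input: I) => Promise<unknown>;
}

// Hypothetical custom tool: counts the words in a piece of text.
const wordCount: SketchTool<{ text: string }> = {
  description: "Count the words in a piece of text",
  parseInput(raw) {
    if (typeof raw !== "object" || raw === null) {
      throw new Error("input must be an object");
    }
    const text = (raw as { text?: unknown }).text;
    if (typeof text !== "string") throw new Error("text must be a string");
    return { text };
  },
  async execute({ text }) {
    return { words: text.split(/\s+/).filter(Boolean).length };
  },
};

// In the scaffold this entry would be added to the object buildTools returns.
const customTools = { "text.wordCount": wordCount };

type ToolOutcome =
  | { type: "ok"; output: unknown }
  | { type: "tool-input-error"; errorText: string };

// Validate first, then execute: invalid input never reaches `execute`.
async function runTool<I>(tool: SketchTool<I>, raw: unknown): Promise<ToolOutcome> {
  let input: I;
  try {
    input = tool.parseInput(raw);
  } catch (err) {
    const errorText = err instanceof Error ? err.message : String(err);
    return { type: "tool-input-error", errorText };
  }
  return { type: "ok", output: await tool.execute(input) };
}
```

Keeping validation in front of `execute` means a tool body can trust its input types, and a malformed call from the model surfaces as a distinct, renderable error instead of a thrown exception.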
Context compaction
For long conversations, the scaffold automatically compacts older turns to stay under the model’s context budget. The default config is exported from `deepspace/worker`.
`prepareMessagesWithCompaction` in `chat-routes.ts` applies these steps in order:
- Truncate old tool-result payloads (preserves the most recent N intact).
- Apply a cached summary if one exists.
- If still over budget, summarize the older half of the conversation.
- As a final fallback, apply a sliding window down to `minKept` messages.
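The four stages above can be sketched as one function. This is an illustration under stated assumptions, not the code in `deepspace/worker`: apart from `minKept`, every config field, the characters-per-token estimate, the `CachedSummary` shape, and the synchronous `summarize` callback are hypothetical (a real summarizer would be an asynchronous model call).

```typescript
type Role = "system" | "user" | "assistant" | "tool";
interface Msg { role: Role; content: string }

// Hypothetical config; only `minKept` is named in the text above.
interface CompactionConfig {
  budgetTokens: number;       // context budget to stay under
  keepToolResults: number;    // most recent tool results left intact
  toolResultMaxChars: number; // length older tool results are cut to
  minKept: number;            // floor for the sliding window
}

// Hypothetical cache entry: a summary standing in for the first `covers` messages.
interface CachedSummary { text: string; covers: number }

// Crude estimate (about 4 characters per token); an assumption, not a real tokenizer.
const estimateTokens = (msgs: Msg[]): number =>
  msgs.reduce((n, m) => n + Math.ceil(m.content.length / 4), 0);

const summaryMsg = (text: string): Msg => ({
  role: "system",
  content: `Summary of earlier turns: ${text}`,
});

function compact(
  history: Msg[],
  cfg: CompactionConfig,
  cached: CachedSummary | null,
  summarize: (older: Msg[]) => string,
): Msg[] {
  // 1. Truncate old tool results, keeping the most recent N intact.
  const toolIdx: number[] = [];
  history.forEach((m, i) => { if (m.role === "tool") toolIdx.push(i); });
  const intact = cfg.keepToolResults > 0 ? toolIdx.slice(-cfg.keepToolResults) : [];
  let msgs = history.map((m, i) =>
    m.role === "tool" && intact.indexOf(i) === -1 && m.content.length > cfg.toolResultMaxChars
      ? { ...m, content: `${m.content.slice(0, cfg.toolResultMaxChars)} [truncated]` }
      : m,
  );

  // 2. Apply a cached summary in place of the messages it covers.
  if (cached !== null) {
    msgs = [summaryMsg(cached.text), ...msgs.slice(cached.covers)];
  }

  // 3. Still over budget: summarize the older half.
  const half = Math.floor(msgs.length / 2);
  if (estimateTokens(msgs) > cfg.budgetTokens && half > 0) {
    msgs = [summaryMsg(summarize(msgs.slice(0, half))), ...msgs.slice(half)];
  }

  // 4. Final fallback: drop the oldest messages until under budget or at minKept.
  while (estimateTokens(msgs) > cfg.budgetTokens && msgs.length > cfg.minKept) {
    msgs = msgs.slice(1);
  }
  return msgs;
}
```

The stages are ordered from cheapest to most destructive: truncation and a cached summary cost no model call, a fresh summary costs one, and the sliding window discards context outright, so it runs only when nothing else fits.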
Custom chat UI
If you want to build your own chat surface (sidebar, modal, minimal), call `POST /api/ai/chat` directly and decode the SSE stream with the SDK’s wire helpers:
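For orientation, the framing step of reading such a stream can be sketched without the SDK. This is a hypothetical stand-in, not the wire helpers themselves: `SseBuffer` is an invented name, and the sketch assumes each event is a single `data: <json>` line, that events are separated by a blank line (`\n\n`), and that the stream ends with a `data: [DONE]` sentinel. Prefer the SDK's helpers where they are available.

```typescript
// Incremental SSE splitter. Feed it decoded text as it arrives; it returns
// the JSON payload of every complete event and buffers any partial event.
class SseBuffer {
  private buf = "";
  done = false;

  push(text: string): unknown[] {
    this.buf += text;
    const payloads: unknown[] = [];
    let sep = this.buf.indexOf("\n\n");
    // A blank line terminates each event.
    while (sep !== -1) {
      const event = this.buf.slice(0, sep);
      this.buf = this.buf.slice(sep + 2);
      for (const line of event.split("\n")) {
        if (!line.startsWith("data:")) continue; // skip comments and other fields
        const data = line.slice(5).trim();
        if (data === "[DONE]") {
          this.done = true; // assumed end-of-stream sentinel
          continue;
        }
        payloads.push(JSON.parse(data));
      }
      sep = this.buf.indexOf("\n\n");
    }
    return payloads;
  }
}
```

Network reads can split an event anywhere, including inside a JSON string, which is why the class buffers until it sees a full event instead of parsing each read on its own. In a browser the text would typically come from the response body reader, decoded with `TextDecoder` in streaming mode.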
Action vocabulary
decodeAiStreamChunk returns one of:
| Action | When it fires |
|---|---|
| `append-text` | Text-delta token from the model |
| `upsert-tool-call` | A tool invocation started |
| `finalize-tool-call` | A tool returned its result |
| `fail-tool-input` | The tool’s Zod schema rejected the input |
| `fail-tool-output` | The tool’s execute threw |
| `stream-error` | Top-level stream error |
| `abort` | Server-side abort with no error chunk to follow |
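One way to consume this vocabulary is a pure reducer that folds actions into the assistant message being rendered. The sketch below is illustrative only: the action names come from the table above, but every payload field (`text`, `toolCallId`, `toolName`, `output`, `error`) and the `AssistantDraft` shape are assumptions, not the SDK's types.

```typescript
// Action names come from the table above; payload fields are assumptions.
type ChatAction =
  | { type: "append-text"; text: string }
  | { type: "upsert-tool-call"; toolCallId: string; toolName: string }
  | { type: "finalize-tool-call"; toolCallId: string; output: unknown }
  | { type: "fail-tool-input"; toolCallId: string; error: string }
  | { type: "fail-tool-output"; toolCallId: string; error: string }
  | { type: "stream-error"; error: string }
  | { type: "abort" };

interface ToolCallView {
  toolName: string;
  status: "running" | "done" | "failed";
  output?: unknown;
  error?: string;
}

// Hypothetical UI state for the assistant message currently streaming.
interface AssistantDraft {
  text: string;
  toolCalls: Record<string, ToolCallView>;
  status: "streaming" | "error" | "aborted";
  error?: string;
}

// Merge a patch into one tool call, creating the entry if it is missing.
function patchCall(state: AssistantDraft, id: string, patch: Partial<ToolCallView>): AssistantDraft {
  const prev: ToolCallView = state.toolCalls[id] ?? { toolName: "unknown", status: "running" };
  return { ...state, toolCalls: { ...state.toolCalls, [id]: { ...prev, ...patch } } };
}

// Pure reducer: returns a new draft and never mutates the previous one.
function applyAction(state: AssistantDraft, action: ChatAction): AssistantDraft {
  switch (action.type) {
    case "append-text":
      return { ...state, text: state.text + action.text };
    case "upsert-tool-call":
      return patchCall(state, action.toolCallId, { toolName: action.toolName, status: "running" });
    case "finalize-tool-call":
      return patchCall(state, action.toolCallId, { status: "done", output: action.output });
    case "fail-tool-input":
    case "fail-tool-output":
      return patchCall(state, action.toolCallId, { status: "failed", error: action.error });
    case "stream-error":
      return { ...state, status: "error", error: action.error };
    case "abort":
      return { ...state, status: "aborted" };
  }
}
```

Because the reducer is pure, it drops into React's `useReducer` or any store unchanged, and a recorded list of actions can be replayed in a test to reproduce a rendering bug.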
Read stored messages by querying `ai-messages` from inside a `RecordScope`. The `parts` field on each message’s `data` holds UI-shape tool invocations for rendering.
Limitations
Concurrent multi-tab writes can interleave
The Durable Object serializes individual writes, but not the per-request 3-write group (user → assistant → metadata). Two tabs sending turns simultaneously to the same `chatId` can produce a history that is not strictly paired. In practice this is rare.
`stopWhen: stepCountIs(5)` caps tool loops
Each turn can chain up to 5 tool calls. Raise this in `chat-routes.ts` for agentic workflows that need more steps - each step is a full LLM round-trip with proportional cost.
Reasoning content is stripped
`toUIMessageStreamResponse({ sendReasoning: false })` removes `reasoning-*` chunks. The default UI has no “thinking” disclosure block, so flipping this on without UI changes shows no progress indication during long reasoning steps.
`X-Asst-Id` header is required for dedup
The header lets the client tag in-flight overlays with a server-generated ID that survives clock skew. If you proxy the streaming response through another worker, preserve the header.
Next steps
- Worker AI reference - `createDeepSpaceAI`, compaction helpers, chat-history wrappers.
- Server actions - privileged routes that bypass user RBAC.
- External APIs - call LLMs and other services through `integration.post`.