Stream multi-turn chat with persistent history and built-in tool use over your records. The scaffold registers four HTTP endpoints, persists every chat to a Durable Object, and streams responses via Vercel AI SDK v5. Use the bundled ChatPanel component for a turnkey UI, or call the streaming endpoint directly and decode it with the wire helpers.

Install the chat feature

Install the bundled feature:
npx deepspace add ai-chat
This installs three files:
  • src/components/ChatPanel.tsx - a reusable chat surface with model picker, abort button, and Markdown rendering.
  • src/pages/assistant.tsx - a full-page assistant with a chat history rail.
  • src/schemas/ai-chat-schema.ts - exports aiChatSchemas (an array of [AI_CHATS_SCHEMA, AI_MESSAGES_SCHEMA]) for spreading into src/schemas.ts.
The feature also adds react-markdown, remark-gfm, remark-breaks, rehype-highlight, and highlight.js as dependencies and runs npm install.

Add the schemas

If you’re wiring chat by hand, import the two pre-built schemas directly:
// src/schemas.ts
import { AI_CHATS_SCHEMA, AI_MESSAGES_SCHEMA } from 'deepspace/worker'

export const schemas = [
  usersSchema,
  settingsSchema,
  AI_CHATS_SCHEMA,
  AI_MESSAGES_SCHEMA,
  // ...your collections
]
Or, if you ran npx deepspace add ai-chat, spread the array the feature installed:
// src/schemas.ts
import { aiChatSchemas } from './schemas/ai-chat-schema'

export const schemas = [usersSchema, settingsSchema, ...aiChatSchemas]
| Schema | Rows | RBAC |
| --- | --- | --- |
| AI_CHATS_SCHEMA (ai-chats) | One per chat conversation | read/update/delete: 'own', create: false |
| AI_MESSAGES_SCHEMA (ai-messages) | One per message (user or assistant) | read/update/delete: 'own', create: false |
create: false is intentional. If clients could create rows directly over the WebSocket, a user could write a forged role: 'assistant' row that the next turn’s history would read back as if it were a real LLM response. Don’t relax create to 'own'. All writes flow through the worker’s chat routes, which validate ownership.

The four chat endpoints

The scaffold registers four endpoints in src/ai/chat-routes.ts. The first three manage chat records; the fourth streams a turn.
POST /api/ai/chats
Authorization: Bearer <jwt>
Content-Type: application/json

{ "title": "Untitled" }
Returns:
{ "chat": { "recordId": "chat_abc", "userId": "...", "title": "Untitled", ... } }
Creates a chat row owned by the JWT subject. The title field is optional.
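As a rough client-side illustration of the request above, the following sketch builds the fetch arguments for creating a chat. The helper name buildCreateChatRequest and the choice to send an empty JSON object when no title is given are assumptions for illustration, not part of the scaffold.

```typescript
// Fetch arguments for POST /api/ai/chats.
interface CreateChatRequest {
  url: string
  init: {
    method: 'POST'
    headers: Record<string, string>
    body: string
  }
}

// Hypothetical helper (not part of the scaffold). The title is optional,
// so it is only included in the body when the caller provides one.
function buildCreateChatRequest(jwt: string, title?: string): CreateChatRequest {
  const payload: { title?: string } = {}
  if (title !== undefined) payload.title = title
  return {
    url: '/api/ai/chats',
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${jwt}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(payload),
    },
  }
}
```

A caller would pass the result straight to fetch(url, init) and read the chat field from the JSON response.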

Streaming pipeline

The POST /api/ai/chat handler runs through these steps:
  1. Verify the JWT. Reject anonymous callers with 401.
  2. Look up the chat. Return 404 if the chat doesn’t exist or belongs to another user.
  3. Load history without persisting the new message. The new user turn is appended in memory only; both user and assistant rows persist together inside onFinish, so a stream error leaves zero orphan rows.
  4. Prepare messages with compaction. Truncate old tool results, apply a cached summary if one exists, and summarize the older half of the conversation if still over the context budget.
  5. Stream the model. Call streamText with the prepared messages, tools, and an abort signal tied to the request.
  6. Persist on completion. Write user → assistant → metadata rows in that order, with retry. If the user write fails twice, the assistant write is skipped to keep history paired.
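The persistence step can be sketched roughly as follows. This is a minimal illustration, not the scaffold's code: the WriteRow callback, the function names, and the decision to also skip the metadata row when the user row fails are all assumptions.

```typescript
type RowKind = 'user' | 'assistant' | 'metadata'

// Hypothetical write callback; in a real handler this would write to the Durable Object.
type WriteRow = (kind: RowKind) => Promise<void>

// Try a write up to `attempts` times and report whether it eventually succeeded.
async function writeWithRetry(write: WriteRow, kind: RowKind, attempts = 2): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      await write(kind)
      return true
    } catch {
      // Swallow the error and retry; the caller decides what a final failure means.
    }
  }
  return false
}

// Persist in the documented order: user, then assistant, then metadata.
// If the user row never lands, stop so history never holds an unpaired assistant row.
// (Skipping the metadata row in that case is an assumption of this sketch.)
async function persistTurn(write: WriteRow): Promise<RowKind[]> {
  const written: RowKind[] = []
  if (!(await writeWithRetry(write, 'user'))) return written
  written.push('user')
  if (await writeWithRetry(write, 'assistant')) written.push('assistant')
  if (await writeWithRetry(write, 'metadata')) written.push('metadata')
  return written
}
```

Returning the list of rows that were actually written makes it easy to log partial failures without throwing from inside onFinish.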

Switch the model

src/ai/chat-routes.ts maps allowed model IDs to providers:
const ALLOWED_MODELS: Record<string, 'anthropic' | 'openai' | 'cerebras'> = {
  'claude-opus-4-7':    'anthropic',
  'claude-sonnet-4-6':  'anthropic',
  'claude-haiku-4-5':   'anthropic',
  'gpt-5.4':            'openai',
  'gpt-5.4-mini':       'openai',
  'gpt-5.4-nano':       'openai',
  'gpt-oss-120b':       'cerebras',
}
const DEFAULT_MODEL = 'claude-sonnet-4-6'
Unknown modelId values are rejected with 400 - there is no silent fallback. Provider routing happens via createDeepSpaceAI:
import { createDeepSpaceAI } from 'deepspace/worker'

const provider = createDeepSpaceAI(env, 'anthropic', { authToken: jwt })
| authToken | Who pays |
| --- | --- |
| Passed | The caller (signed-in user), billed against their DeepSpace credits |
| Omitted | The app owner, billed via APP_OWNER_JWT |
The scaffold’s chat routes pass the caller’s JWT, so each user pays for their own conversation. Omit authToken for autonomous server-side calls (cron, server actions).
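The allow-list check described earlier in this section can be sketched as a small pure function. The resolveModel name and the result shape are hypothetical, and falling back to DEFAULT_MODEL when modelId is omitted is an assumption; the only documented behavior is that unknown IDs are rejected with 400.

```typescript
type Provider = 'anthropic' | 'openai' | 'cerebras'

const ALLOWED_MODELS: Record<string, Provider> = {
  'claude-opus-4-7': 'anthropic',
  'claude-sonnet-4-6': 'anthropic',
  'claude-haiku-4-5': 'anthropic',
  'gpt-5.4': 'openai',
  'gpt-5.4-mini': 'openai',
  'gpt-5.4-nano': 'openai',
  'gpt-oss-120b': 'cerebras',
}
const DEFAULT_MODEL = 'claude-sonnet-4-6'

type ModelResolution =
  | { ok: true; modelId: string; provider: Provider }
  | { ok: false; status: number; error: string }

// Hypothetical helper. Assumption: an omitted modelId uses DEFAULT_MODEL,
// while an unknown one is rejected rather than silently replaced.
function resolveModel(requested?: string): ModelResolution {
  const modelId = requested ?? DEFAULT_MODEL
  // Own-property check so inherited keys such as "toString" are never accepted.
  const provider = Object.prototype.hasOwnProperty.call(ALLOWED_MODELS, modelId)
    ? ALLOWED_MODELS[modelId]
    : undefined
  if (provider === undefined) {
    return { ok: false, status: 400, error: `Unknown modelId: ${modelId}` }
  }
  return { ok: true, modelId, provider }
}
```

A route handler could return the status and error fields directly as the HTTP response when ok is false, and pass provider to createDeepSpaceAI otherwise.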

Tool use

The assistant can read and modify your records via a built-in tool catalog. The scaffold ships all of them in src/ai/tools.ts:
| Tool | Purpose |
| --- | --- |
| schema.list | Enumerate collection names |
| schema.describe | Describe one collection’s columns and permissions |
| records.query | Filter and list records |
| records.get | Fetch one record |
| records.create | Create a record |
| records.update | Patch a record |
| records.delete | Delete a record |
| user.current | Look up the caller’s user record |
Per-collection RBAC at the DO is the security boundary. The user’s own role determines what each tool call can do - the assistant cannot escalate. To run a stricter assistant, trim ALLOWED_TOOL_NAMES to reads only.
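One way to picture a reads-only assistant is a filter over the tool map. The sketch below assumes the allow-list is a plain list of tool names; the actual shape of ALLOWED_TOOL_NAMES in the scaffold may differ, and pickAllowedTools and READ_ONLY_TOOL_NAMES are hypothetical names.

```typescript
// Read-only subset of the built-in catalog listed above.
const READ_ONLY_TOOL_NAMES: readonly string[] = [
  'schema.list',
  'schema.describe',
  'records.query',
  'records.get',
  'user.current',
]

// Hypothetical helper: keep only the tools whose names are on the allow-list.
function pickAllowedTools<T>(tools: Record<string, T>, allowed: readonly string[]): Record<string, T> {
  const allowedSet = new Set(allowed)
  const picked: Record<string, T> = {}
  for (const name of Object.keys(tools)) {
    const tool = tools[name]
    if (tool !== undefined && allowedSet.has(name)) picked[name] = tool
  }
  return picked
}
```

Filtering the tool map only narrows what the model is offered; the per-collection RBAC check at the DO remains the actual security boundary.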

Adding custom tools

Extend the ToolSet returned by buildTools in src/ai/tools.ts:
// src/ai/tools.ts
import { tool, type ToolSet } from 'ai'
import { z } from 'zod'
import { BUILT_IN_TOOLS } from 'deepspace/worker'

export function buildTools(executor: ToolExecutor): ToolSet {
  const tools: ToolSet = {}

  // ...existing loop over BUILT_IN_TOOLS...

  tools.lookup_weather = tool({
    description: 'Get weather for a city',
    inputSchema: z.object({ city: z.string() }),
    execute: async ({ city }) => {
      const res = await fetch(`https://api.example.com/weather?city=${encodeURIComponent(city)}`)
      if (!res.ok) return { error: `weather lookup failed: ${res.status}` }
      return await res.json()
    },
  })

  return tools
}
The Zod inputSchema doubles as runtime validation; failing input emits a tool-input-error SSE chunk the client surfaces.

Context compaction

For long conversations, the scaffold automatically compacts older turns to stay under the model’s context budget. The default config exported from deepspace/worker:
import { DEFAULT_CONTEXT_CONFIG } from 'deepspace/worker'

// {
//   contextBudget: 240_000,    // chars - ≈60–80K tokens
//   toolResultCap: 30_000,     // bytes per tool result
//   keepRecentToolResults: 5,
//   minKept: 10,               // sliding-window floor
// }
Tune for shorter-context models by passing your own config to prepareMessagesWithCompaction in chat-routes.ts:
import { DEFAULT_CONTEXT_CONFIG, prepareMessagesWithCompaction } from 'deepspace/worker'

const config = {
  ...DEFAULT_CONTEXT_CONFIG,
  contextBudget: 120_000,    // for 128K-context models
}

const { messages: prepared, newSummary } = await prepareMessagesWithCompaction(
  turns,
  config,
  { summarizer, cachedSummary },
)
The pipeline:
  1. Truncate old tool-result payloads (preserves the most recent N intact).
  2. Apply a cached summary if one exists.
  3. If still over budget, summarize the older half of the conversation.
  4. As a final fallback, apply a sliding window down to minKept messages.
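Steps 1 and 4 of the pipeline above are simple enough to sketch without a summarizer. The code below is an illustration under stated assumptions, not the library's implementation: the Turn shape is hypothetical, sizes are measured in characters rather than bytes, and the " [truncated]" marker is invented for the example.

```typescript
// Hypothetical message shape for this sketch.
interface Turn {
  role: 'user' | 'assistant' | 'tool'
  content: string
}

interface ContextConfig {
  contextBudget: number
  toolResultCap: number
  keepRecentToolResults: number
  minKept: number
}

const totalChars = (turns: Turn[]): number =>
  turns.reduce((sum, t) => sum + t.content.length, 0)

// Step 1: cap every tool result except the most recent N.
function truncateOldToolResults(turns: Turn[], cfg: ContextConfig): Turn[] {
  const toolIndexes: number[] = []
  turns.forEach((t, i) => {
    if (t.role === 'tool') toolIndexes.push(i)
  })
  const keep = Math.min(Math.max(cfg.keepRecentToolResults, 0), toolIndexes.length)
  const protectedIndexes = new Set(toolIndexes.slice(toolIndexes.length - keep))
  return turns.map((t, i) => {
    const isOld = t.role === 'tool' && !protectedIndexes.has(i)
    if (!isOld || t.content.length <= cfg.toolResultCap) return t
    return { ...t, content: t.content.slice(0, cfg.toolResultCap) + ' [truncated]' }
  })
}

// Step 4: drop the oldest turns while over budget, never going below minKept.
function slidingWindow(turns: Turn[], cfg: ContextConfig): Turn[] {
  let start = 0
  while (turns.length - start > cfg.minKept && totalChars(turns.slice(start)) > cfg.contextBudget) {
    start++
  }
  return turns.slice(start)
}

// Steps 1 and 4 combined; the summary steps (2 and 3) are omitted from this sketch.
function compactWithoutSummary(turns: Turn[], cfg: ContextConfig): Turn[] {
  return slidingWindow(truncateOldToolResults(turns, cfg), cfg)
}
```

Because minKept is a floor, the window can still exceed contextBudget when the last few messages are very large; that trade-off keeps the most recent exchange intact.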

Custom chat UI

If you want to build your own chat surface (sidebar, modal, minimal), call POST /api/ai/chat directly and decode the SSE stream with the SDK’s wire helpers:
import { parseSseLine, decodeAiStreamChunk, getAuthToken, type AiStreamAction } from 'deepspace'

async function streamTurn(chatId: string, content: string, handleAction: (asstId: string, action: AiStreamAction) => void) {
  const token = await getAuthToken()
  const res = await fetch('/api/ai/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${token}` },
    body: JSON.stringify({ chatId, userMessageId: crypto.randomUUID(), content }),
  })

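  // Fail fast on HTTP errors (for example 400, 401, 404) before reading the stream.
  if (!res.ok || !res.body) throw new Error(`chat request failed: ${res.status}`)

  // Server-generated ID for the in-flight assistant message.
  const asstId = res.headers.get('X-Asst-Id')
  if (!asstId) throw new Error('missing X-Asst-Id header')
  const reader = res.body.getReader()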
  const decoder = new TextDecoder()
  let buffer = ''

  while (true) {
    const { value, done } = await reader.read()
    if (done) break
    buffer += decoder.decode(value, { stream: true })
    const lines = buffer.split('\n')
    buffer = lines.pop() ?? ''
    for (const line of lines) {
      const chunk = parseSseLine(line)
      if (!chunk) continue
      const action = decodeAiStreamChunk(chunk)
      if (action) handleAction(asstId, action)
    }
  }
}
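The buffering in the loop above matters because a network chunk can end in the middle of a line. The following self-contained sketch isolates that logic. LineBuffer and parseDataLine are hypothetical stand-ins written for this example; they are not the SDK's parseSseLine, and the assumption that each event is a "data:" line carrying JSON (with a "[DONE]" terminator) should be checked against the actual stream.

```typescript
// Accumulates decoded text and returns only complete lines, carrying any
// trailing partial line over to the next push.
class LineBuffer {
  private pending = ''

  push(text: string): string[] {
    const lines = (this.pending + text).split('\n')
    this.pending = lines.pop() ?? ''
    return lines
  }
}

// Hypothetical stand-in for an SSE data-line parser: returns the JSON payload
// of a "data:" line, or null for blank lines, other fields, and the [DONE] marker.
function parseDataLine(line: string): unknown {
  const trimmed = line.trim()
  if (!trimmed.startsWith('data:')) return null
  const payload = trimmed.slice('data:'.length).trim()
  if (payload === '' || payload === '[DONE]') return null
  try {
    return JSON.parse(payload)
  } catch {
    return null // Malformed payloads are skipped rather than thrown.
  }
}
```

Feeding each decoded chunk through push and parsing only the returned lines means a JSON payload split across two chunks is parsed once, after its closing newline arrives.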

Action vocabulary

decodeAiStreamChunk returns one of:
| Action | When it fires |
| --- | --- |
| append-text | Text-delta token from the model |
| upsert-tool-call | A tool invocation started |
| finalize-tool-call | A tool returned its result |
| fail-tool-input | The tool’s Zod schema rejected the input |
| fail-tool-output | The tool’s execute threw |
| stream-error | Top-level stream error |
| abort | Server-side abort with no error chunk to follow |
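A handler for these actions is often easiest to write as a reducer over an in-flight overlay. The sketch below uses the action names from the list above, but the payload fields (text, toolCallId, toolName, output, error) and the Overlay shape are assumptions made for illustration; check the SDK's AiStreamAction type for the actual fields.

```typescript
// Hypothetical local type: action names come from the docs, payload fields are assumed.
type SketchAction =
  | { type: 'append-text'; text: string }
  | { type: 'upsert-tool-call'; toolCallId: string; toolName: string }
  | { type: 'finalize-tool-call'; toolCallId: string; output: unknown }
  | { type: 'fail-tool-input'; toolCallId: string; error: string }
  | { type: 'fail-tool-output'; toolCallId: string; error: string }
  | { type: 'stream-error'; error: string }
  | { type: 'abort' }

interface ToolCallView {
  toolName: string
  status: 'running' | 'done' | 'failed'
  detail?: unknown
}

interface Overlay {
  text: string
  toolCalls: Record<string, ToolCallView>
  status: 'streaming' | 'error' | 'aborted'
  error?: string
}

// Update one tool call, keeping its name if it was already registered.
function withToolCall(state: Overlay, id: string, status: ToolCallView['status'], detail: unknown): Overlay {
  const existing = state.toolCalls[id]
  const toolName = existing ? existing.toolName : 'unknown'
  return { ...state, toolCalls: { ...state.toolCalls, [id]: { toolName, status, detail } } }
}

// Pure reducer: returns a new overlay for each decoded action.
function applyAction(state: Overlay, action: SketchAction): Overlay {
  switch (action.type) {
    case 'append-text':
      return { ...state, text: state.text + action.text }
    case 'upsert-tool-call':
      return {
        ...state,
        toolCalls: { ...state.toolCalls, [action.toolCallId]: { toolName: action.toolName, status: 'running' } },
      }
    case 'finalize-tool-call':
      return withToolCall(state, action.toolCallId, 'done', action.output)
    case 'fail-tool-input':
    case 'fail-tool-output':
      return withToolCall(state, action.toolCallId, 'failed', action.error)
    case 'stream-error':
      return { ...state, status: 'error', error: action.error }
    case 'abort':
      return { ...state, status: 'aborted' }
  }
}
```

Keeping the reducer pure makes it straightforward to plug into React state (for example via useReducer) and to unit-test without a live stream.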
For the canonical message list, query ai-messages from inside a RecordScope:
import { useQuery } from 'deepspace'

type AiMessageData = {
  chatId: string
  userId: string
  role: 'user' | 'assistant'
  content: string
  parts?: unknown[]
}

const { records } = useQuery<AiMessageData>('ai-messages', {
  where: { chatId, userId },
  orderBy: 'createdAt',
  orderDir: 'asc',
})

// Each record is { recordId, data, createdAt, updatedAt } - fields live on `.data`.
records.map((r) => ({ id: r.recordId, role: r.data.role, content: r.data.content }))
The parts field on each data holds UI-shape tool invocations for rendering.

Limitations

The Durable Object serializes individual writes, but not the per-request 3-write group (user → assistant → metadata). Two tabs sending turns simultaneously to the same chatId can interleave their writes, so the stored history may not strictly alternate user and assistant rows. In practice this is rare.
Each turn can chain up to 5 tool calls. Raise this in chat-routes.ts for agentic workflows that need more steps - each step is a full LLM round-trip with proportional cost.
toUIMessageStreamResponse({ sendReasoning: false }) strips reasoning-* chunks from the stream. The default UI has no “thinking” disclosure block, so setting sendReasoning: true without UI changes still shows no progress indication during long reasoning steps.
The X-Asst-Id response header lets the client tag in-flight overlays with a server-generated ID that survives clock skew. If you proxy the streaming response through another worker, preserve the header.
