The highest-level approach. Handles message accumulation, stop signals, turn lifecycle, and auto-piping automatically.
To fix a custom UIMessage subtype or a typed client data schema, use the ChatBuilder via chat.withUIMessage<...>() and/or chat.withClientData({ schema }). Builder-level hooks can also be chained before .agent(). See Types.
Every chat.agent conversation is backed by a durable Session — externalId is your chatId, type is "chat.agent", taskIdentifier is the agent’s task ID. The session is the run manager: it owns the chat’s runs, persists across run lifecycles, and orchestrates handoffs (idle continuation, chat.requestUpgrade). You rarely need to touch the session directly (chat.stream, chat.messages, chat.stopSignal wrap everything), but payload.sessionId is available if you want to reach in — e.g. sessions.open(payload.sessionId) to write from a sub-agent or from outside the turn loop.
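For example, a minimal sketch of reaching into the session from a sub-agent. The import path and the payload shape here are assumptions; see the Sessions docs for the full API:

```ts
import { task } from "@trigger.dev/sdk";
import { sessions } from "@trigger.dev/sdk/ai"; // import path assumed

export const subAgent = task({
  id: "sub-agent",
  run: async (payload: { sessionId: string }) => {
    // Open the parent chat's session from outside its turn loop,
    // e.g. to write progress or inject context while the chat idles
    const session = await sessions.open(payload.sessionId);
    // ... use the session handle per the Sessions API
  },
});
```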
For complex agent flows where streamText is called deep inside your code, use chat.pipe(). It works from anywhere inside a task — even nested function calls.
trigger/agent-chat.ts
```ts
import { chat } from "@trigger.dev/sdk/ai";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";
import type { ModelMessage } from "ai";

export const agentChat = chat.agent({
  id: "agent-chat",
  run: async ({ messages }) => {
    // Don't return anything — chat.pipe is called inside
    await runAgentLoop(messages);
  },
});

async function runAgentLoop(messages: ModelMessage[]) {
  // ... agent logic, tool calls, etc.
  const result = streamText({
    model: openai("gpt-4o"),
    messages,
  });
  // Pipe from anywhere — no need to return it
  await chat.pipe(result);
}
```
Every chat lifecycle callback and the run payload include ctx: the same run context object as task({ run: (payload, { ctx }) => ... }). Import the type with import type { TaskRunContext } from "@trigger.dev/sdk" (the Context export is the same type). Use ctx for tags, metadata, or any API that needs the full run record. The string runId on chat events is always ctx.run.id (both are provided for convenience). See Task context (ctx) in the API reference.

Standard task lifecycle hooks — onWait, onResume, onComplete, onFailure, etc. — are also available on chat.agent() with the same shapes as on a normal task().

Chat agents also have two dedicated suspension hooks — onChatSuspend and onChatResume — that fire at the idle-to-suspended transition with full chat context. Use them for resource cleanup (e.g. tearing down sandboxes) and re-initialization. See onChatSuspend / onChatResume and the Code execution sandbox pattern.
Fires when a preloaded run starts — before any messages arrive. Use it to eagerly initialize state (DB records, user context) while the user is still typing.

Preloaded runs are triggered by calling transport.preload(chatId) on the frontend. See Preload for details.
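A minimal sketch (loadUserContext is a hypothetical helper):

```ts
export const myChat = chat.agent({
  id: "my-chat",
  onPreload: async ({ chatId, clientData }) => {
    // Runs while the user is still typing their first message
    const { userId } = clientData as { userId: string };
    await db.chat.create({ data: { id: chatId, userId, title: "New chat" } });
    await loadUserContext(userId); // hypothetical: warm caches, fetch user settings
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```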
Every lifecycle callback receives a writer — a lazy stream writer that lets you send custom UIMessageChunk parts (like data-* parts) to the frontend. Non-transient data-* chunks written via the writer are automatically added to the response message and available in onTurnComplete. Add transient: true for ephemeral chunks (progress indicators, etc.) that should not persist. See Custom data parts.
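For example, from any hook (the data-* part names here are illustrative):

```ts
onTurnStart: async ({ writer }) => {
  // Persisted: added to the response message, visible in onTurnComplete
  writer.write({ type: "data-status", data: { stage: "preparing" } });

  // Transient: streamed to the client but never persisted
  writer.write({ type: "data-progress", data: { percent: 0 }, transient: true });
},
```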
Fires once on the first turn (turn 0) before run() executes. Use it to create a chat record in your database.

The continuation field tells you whether this is a brand new chat or a continuation of an existing one (where the previous run timed out or was cancelled). The preloaded field tells you whether onPreload already ran.
```ts
export const myChat = chat.agent({
  id: "my-chat",
  onChatStart: async ({ chatId, clientData, continuation, preloaded }) => {
    if (preloaded) return; // Already set up in onPreload
    if (continuation) return; // Chat record already exists

    const { userId } = clientData as { userId: string };
    await db.chat.create({
      data: { id: chatId, userId, title: "New chat" },
    });
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```
clientData contains custom data from the frontend — either the clientData option on the
transport constructor (sent with every message) or the metadata option on sendMessage()
(per-message). See Client data and metadata.
Validate or transform incoming UIMessage[] before they are converted to model messages. Fires once per turn with the raw messages from the wire payload (after cleanup of aborted tool parts), before accumulation and toModelMessages().

Return the validated messages array. Throw to abort the turn with an error.

This is the right place to call the AI SDK's validateUIMessages to catch malformed messages from storage or untrusted input before they reach the model — especially useful when persisting conversations to a database, where tool schemas may drift between deploys.
onValidateMessages fires before onTurnStart and message accumulation. If you need to validate messages loaded from a database, do the loading in onChatStart or onPreload and let onValidateMessages validate the full incoming set each turn.
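A minimal sketch, assuming the hook receives the incoming messages as messages:

```ts
import { validateUIMessages } from "ai";

export const myChat = chat.agent({
  id: "my-chat",
  onValidateMessages: async ({ messages }) => {
    // Throws on malformed messages, which aborts the turn with an error
    return await validateUIMessages({ messages });
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```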
Load the full message history from your backend on every turn, replacing the built-in linear accumulator. When set, the hook's return value becomes the accumulated state — the normal accumulation logic (append for submit, replace for regenerate) is skipped entirely.

Use this when the backend should be the source of truth for message history — abuse prevention, branching conversations (DAGs), or rollback/undo support.
Lifecycle position: onValidateMessages → hydrateMessages → onChatStart (turn 0) → onTurnStart → run()

After the hook returns, any incoming wire message whose ID matches a hydrated message is auto-merged — this makes tool approvals work transparently with hydration.
hydrateMessages also fires for action turns (trigger: "action") with empty incomingMessages. This lets the action handler work with the latest DB state.
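A sketch with the backend as source of truth. The event is assumed to expose chatId alongside incomingMessages, and the db schema is illustrative:

```ts
export const myChat = chat.agent({
  id: "my-chat",
  hydrateMessages: async ({ chatId, incomingMessages }) => {
    // Load the full history from the database on every turn;
    // wire messages with matching IDs are auto-merged afterwards
    const rows = await db.message.findMany({
      where: { chatId },
      orderBy: { position: "asc" },
    });
    return rows.map((row) => row.payload as UIMessage);
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```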
Fires at the start of every turn, after message accumulation and onChatStart (turn 0), but before run() executes. Use it to persist messages before streaming begins — so a mid-stream page refresh still shows the user's message.
By persisting in onTurnStart, the user’s message is saved to your database before the AI starts
streaming. If the user refreshes mid-stream, the message is already there.
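A sketch, assuming the event exposes the accumulated uiMessages (as onTurnComplete does):

```ts
onTurnStart: async ({ chatId, uiMessages }) => {
  // The new user message is already accumulated at this point;
  // persist before the model starts streaming
  await db.chat.update({
    where: { id: chatId },
    data: { messages: uiMessages },
  });
},
```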
Fires after the response is captured but before the stream closes. The writer can send custom chunks that appear in the current turn — use this for post-processing indicators, compaction progress, or any data the user should see before the turn ends.
```ts
export const myChat = chat.agent({
  id: "my-chat",
  onBeforeTurnComplete: async ({ writer, usage, uiMessages }) => {
    // Write a custom data part while the stream is still open
    writer.write({
      type: "data-usage-summary",
      data: {
        tokens: usage?.totalTokens,
        messageCount: uiMessages.length,
      },
    });

    // You can also compact messages here and write progress
    if (usage?.totalTokens && usage.totalTokens > 50_000) {
      writer.write({ type: "data-compaction", data: { status: "compacting" } });
      chat.setMessages(compactedMessages);
      writer.write({ type: "data-compaction", data: { status: "complete" } });
    }
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```
Fires after each turn completes — after the response is captured and the stream is closed. This is the primary hook for persisting the assistant’s response. Does not include a writer since the stream is already closed.
Use uiMessages to overwrite the full conversation each turn (simplest). Use newUIMessages if
you prefer to store messages individually — for example, one database row per message.
Persist lastEventId alongside the session. When the transport reconnects after a page refresh,
it uses this to skip past already-seen events — preventing duplicate messages.
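A minimal sketch of both, using the fields described above (the db tables are illustrative):

```ts
onTurnComplete: async ({ chatId, uiMessages, lastEventId }) => {
  // Overwrite the full conversation (simplest approach)
  await db.chat.update({
    where: { id: chatId },
    data: { messages: uiMessages },
  });

  // Persist the transport cursor so reconnects skip already-seen events
  await db.chatSession.update({
    where: { chatId },
    data: { lastEventId },
  });
},
```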
For a full conversation + session persistence pattern (including preload, continuation, and token renewal), see Database persistence.
Chat-specific hooks that fire at the idle-to-suspended transition — the moment the run stops using compute and waits for the next message. These replace the need for the generic onWait / onResume task hooks for chat-specific work.

The phase discriminator tells you when the suspend/resume happened:
"preload" — after onPreload, waiting for the first message
"turn" — after onTurnComplete, waiting for the next message
```ts
export const myChat = chat.agent({
  id: "my-chat",
  onChatSuspend: async (event) => {
    // Tear down expensive resources before suspending
    await disposeCodeSandbox(event.ctx.run.id);
    if (event.phase === "turn") {
      logger.info("Suspending after turn", { turn: event.turn });
    }
  },
  onChatResume: async (event) => {
    // Re-initialize after waking up
    logger.info("Resumed", { phase: event.phase });
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```
| Field | Type | Description |
| --- | --- | --- |
| `phase` | `"preload" \| "turn"` | Whether this is a preload or post-turn suspension |
| `ctx` | `TaskRunContext` | Full task run context |
| `chatId` | `string` | Chat session ID |
| `runId` | `string` | The Trigger.dev run ID |
| `clientData` | Typed by `clientDataSchema` | Custom data from the frontend |
| `turn` | `number` | Turn number (`"turn"` phase only) |
| `messages` | `ModelMessage[]` | Accumulated model messages (`"turn"` phase only) |
| `uiMessages` | `UIMessage[]` | Accumulated UI messages (`"turn"` phase only) |
Unlike onWait (which fires for all wait types — duration, task, batch, token), onChatSuspend fires only at chat suspension points with full chat context. No need to filter on wait.type.
When set to true, a preloaded run completes successfully after the idle timeout elapses instead of suspending. Use this for “fire and forget” preloads — if the user doesn’t send a message during the idle window, the run ends cleanly.
```ts
export const myChat = chat.agent({
  id: "my-chat",
  preloadIdleTimeoutInSeconds: 10,
  exitAfterPreloadIdle: true,
  onPreload: async ({ chatId, clientData }) => {
    // Eagerly set up state — if no message comes, the run just ends
    await initializeChat(chatId, clientData);
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```
Use AI Prompts to manage your system prompt as versioned, overridable config. Store the resolved prompt in a lifecycle hook with chat.prompt.set(), then spread chat.toStreamTextOptions() into streamText — it includes the system prompt, model, config, and telemetry automatically.
chat.toStreamTextOptions() returns an object with system, model (resolved via the registry), temperature, and experimental_telemetry — all from the stored prompt. Properties you set after the spread (like a client-selected model) take precedence.
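A sketch of the flow; loadPrompt is a hypothetical resolution call (the real API is covered in the Prompts guide):

```ts
export const myChat = chat.agent({
  id: "my-chat",
  onChatStart: async () => {
    // Resolve the versioned prompt once and store it for the chat
    const prompt = await loadPrompt("support-agent"); // hypothetical helper
    chat.prompt.set(prompt);
  },
  run: async ({ messages, signal }) => {
    return streamText({
      ...chat.toStreamTextOptions(), // system, model, temperature, telemetry
      messages,
      abortSignal: signal,
      // Anything set after the spread wins, e.g. a client-selected model
    });
  },
});
```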
See Prompts for the full guide — defining templates, variable schemas, dashboard
overrides, and the management SDK.
Calling stop() from useChat sends a stop signal to the running task via input streams. The task’s streamText call aborts (if you passed signal or stopSignal), but the run stays alive and waits for the next message. The partial response is captured and accumulated normally.
Use signal (the combined signal) in most cases. The separate stopSignal and cancelSignal are
only needed if you want different behavior for stop vs cancel.
You can also check stop status from anywhere during a turn using chat.isStopped(). This is useful inside streamText’s onFinish callback where the AI SDK’s isAborted flag can be unreliable (e.g. when using createUIMessageStream + writer.merge()):
```ts
import { chat } from "@trigger.dev/sdk/ai";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

export const myChat = chat.agent({
  id: "my-chat",
  run: async ({ messages, signal }) => {
    return streamText({
      model: openai("gpt-4o"),
      messages,
      abortSignal: signal,
      onFinish: ({ isAborted }) => {
        // isAborted may be false even after stop when using createUIMessageStream
        const wasStopped = isAborted || chat.isStopped();
        if (wasStopped) {
          // handle stop — e.g. log analytics
        }
      },
    });
  },
});
```
When stop happens mid-stream, the captured response message can contain parts in an incomplete state — tool calls stuck in partial-call, reasoning blocks still marked as streaming, etc. These can cause UI issues like permanent spinners.

chat.agent automatically cleans up the responseMessage when stop is detected, before passing it to onTurnComplete. If you use chat.pipe() manually and capture response messages yourself, use chat.cleanupAbortedParts().
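A sketch, assuming cleanupAbortedParts takes the captured response message and returns a cleaned copy (persistResponse is a hypothetical helper):

```ts
const response = await chat.pipeAndCapture(result, { signal });

if (response && chat.isStopped()) {
  // Assumed signature: returns a copy with aborted parts normalized
  const cleaned = chat.cleanupAbortedParts(response);
  await persistResponse(cleaned); // hypothetical persistence helper
}
```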
This removes tool invocation parts stuck in partial-call state and marks any streaming text or reasoning parts as done.
Stop signal delivery is best-effort. There is a small race window where the model may finish
before the stop signal arrives, in which case the turn completes normally with stopped: false.
This is expected and does not require special handling.
Tools with needsApproval: true pause execution until the user approves or denies via the frontend. Define the tool as normal and pass it to streamText — chat.agent handles the rest:
```ts
const sendEmail = tool({
  description: "Send an email. Requires human approval.",
  inputSchema: z.object({ to: z.string(), subject: z.string(), body: z.string() }),
  needsApproval: true,
  execute: async ({ to, subject, body }) => {
    await emailService.send({ to, subject, body });
    return { sent: true };
  },
});

export const myChat = chat.agent({
  id: "my-chat",
  run: async ({ messages, signal }) => {
    return streamText({
      model: openai("gpt-4o"),
      messages,
      tools: { sendEmail },
      abortSignal: signal,
    });
  },
});
```
When the model calls an approval-required tool, the turn completes with the tool in approval-requested state. After the user approves on the frontend, the updated message is sent back and chat.agent replaces it in the conversation accumulator by matching the message ID. streamText then executes the approved tool and continues.

See Tool approvals in the frontend docs for the UI setup.
To build a chat app that survives page refreshes, you need to persist two things:
Messages — The conversation history. Persisted server-side in the task via onTurnStart and onTurnComplete.
Sessions — The transport’s connection state (runId, publicAccessToken, lastEventId). Persisted server-side via onTurnStart and onTurnComplete.
Sessions let the transport reconnect to an existing run after a page refresh. Without them, every
page load would start a new run — losing the conversation context that was accumulated in the
previous run.
The example below trusts raw chatId and returns rows without filtering by user. In a real multi-user app, scope every query by the authenticated user — read the user from your auth/session in each server action and add where: { userId } to all db.chat.* and db.chatSession.* queries. Without that, one client could read or delete another user’s chat state, and getAllSessions() would leak other users’ publicAccessTokens. The snippet keeps auth out of the way to focus on the persistence shape.
Users can send messages while the agent is executing tool calls. With pendingMessages, these messages are injected between tool-call steps, steering the agent mid-execution:
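A sketch of the backend side; the pendingMessages option shape shown here is an assumption (see the guide below for the real configuration):

```ts
export const myChat = chat.agent({
  id: "my-chat",
  pendingMessages: true, // assumed shape, illustrative only
  run: async ({ messages, signal }) => {
    // Messages sent mid-execution are injected between tool-call steps
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```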
On the frontend, the usePendingMessages hook handles sending, tracking, and rendering injection points.
See Pending Messages for the full guide — backend configuration,
frontend hook, queuing vs steering, and how injection works with all three chat variants.
Inject context from background work into the conversation using chat.inject(). Combine with chat.defer() to run analysis between turns and inject results before the next response — self-review, RAG augmentation, safety checks, etc.
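A sketch, assuming chat.defer(fn) schedules work between turns and chat.inject() accepts the content to add (selfReview is a hypothetical analysis helper):

```ts
onTurnComplete: async ({ uiMessages }) => {
  // Run analysis off the critical path, between turns
  chat.defer(async () => {
    const review = await selfReview(uiMessages); // hypothetical helper
    if (review.needsCorrection) {
      // Injected context becomes visible to the model on the next turn
      await chat.inject(review.note); // assumed signature
    }
  });
},
```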
Custom actions let the frontend send structured commands (undo, rollback, edit) that modify the conversation state before the LLM responds. Actions use the same input stream as messages, so they wake the agent from suspension and trigger a full turn.

Define an actionSchema for validation and an onAction handler that uses chat.history to modify state.
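For example, a sketch where the undo and rollback action shapes are illustrative:

```ts
import { z } from "zod";

export const myChat = chat.agent({
  id: "my-chat",
  actionSchema: z.discriminatedUnion("type", [
    z.object({ type: z.literal("undo") }),
    z.object({ type: z.literal("rollback"), messageId: z.string() }),
  ]),
  onAction: async ({ action }) => {
    if (action.type === "undo") {
      chat.history.slice(0, -2); // drop the last user/assistant exchange
    } else {
      chat.history.rollbackTo(action.messageId); // keep up to and including this ID
    }
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```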
The action payload is validated against actionSchema on the backend — invalid actions throw and abort the turn. The action parameter in onAction is fully typed from the schema.
Actions always trigger run() — the LLM responds to the modified state. For silent state changes that don’t need a response (e.g. injecting background context), use chat.inject() instead.
Imperative API for modifying the accumulated message history. Works from any hook (onAction, onTurnStart, onBeforeTurnComplete, onTurnComplete) or from run() and AI SDK tools.
| Method | Description |
| --- | --- |
| `chat.history.all()` | Read the current accumulated UI messages (returns a copy) |
| `chat.history.set(messages)` | Replace all messages (same as `chat.setMessages()`) |
| `chat.history.remove(messageId)` | Remove a specific message by ID |
| `chat.history.rollbackTo(messageId)` | Keep messages up to and including the given ID (undo) |
| `chat.history.replace(messageId, message)` | Replace a specific message by ID (edit) |
| `chat.history.slice(start, end?)` | Keep only messages in the given range |
```ts
// Undo the last exchange in onAction
onAction: async ({ action }) => {
  if (action.type === "undo") {
    chat.history.slice(0, -2);
  }
},

// Trim history in onTurnComplete
onTurnComplete: async ({ uiMessages }) => {
  if (uiMessages.length > 50) {
    chat.history.slice(-20);
  }
},
```
Mutations use the same deferred mechanism as chat.setMessages() — they are applied at lifecycle checkpoints (after hooks return). Multiple mutations in the same hook compose correctly.
Transform model messages before they're used anywhere — in run(), in compaction rebuilds, and in compaction results. Define once, applied everywhere.

Use this for Anthropic cache breaks, injecting system context, stripping PII, etc.
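A minimal sketch; the option name transformModelMessages is an assumption (check the API reference for the exact name), and stripPII is a hypothetical helper:

```ts
export const myChat = chat.agent({
  id: "my-chat",
  // Applied to run(), compaction rebuilds, and compaction results alike
  transformModelMessages: (messages: ModelMessage[]) => [
    { role: "system" as const, content: "Answer in the user's language." },
    ...messages.map(stripPII), // hypothetical PII scrubber
  ],
  run: async ({ messages, signal }) => {
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```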
Chat agent runs are pinned to the worker version they started on. When you deploy a new version, suspended runs resume on the old code. Call chat.requestUpgrade() in onTurnStart to skip run() and exit immediately — the transport re-triggers the same message on the latest version. See the Version Upgrades pattern for the full guide.
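A sketch; isStaleVersion is a hypothetical helper, e.g. comparing the worker version the run started on against the latest version you record at deploy time:

```ts
onTurnStart: async ({ ctx }) => {
  // Hypothetical staleness check against your own version record
  if (await isStaleVersion(ctx.run.version)) {
    chat.requestUpgrade(); // skip run(); the transport re-triggers on the latest version
  }
},
```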
By default, a chat agent stays idle after each turn waiting for the next user message. Call chat.endRun() from run(), chat.defer(), onBeforeTurnComplete, or onTurnComplete to exit the loop once the current turn finishes — no upgrade signal, no idle wait.
```ts
chat.agent({
  id: "one-shot",
  run: async ({ messages, signal }) => {
    // Single-response agent — exit after this turn.
    chat.endRun();
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```
The current turn streams through normally, onBeforeTurnComplete / onTurnComplete fire, the turn-complete chunk is written, and the run exits instead of suspending. The next user message on the same chatId starts a fresh run via the standard continuation flow.

Use this when the agent knows its work is done (budget exhausted, goal achieved, one-shot response) rather than relying on the idle timeout. Unlike chat.requestUpgrade(), no upgrade-required signal is sent to the client, so there are no version-migration semantics.
Override how long the run stays idle (active, using compute) after each turn:
```ts
run: async ({ messages, signal }) => {
  chat.setIdleTimeoutInSeconds(60); // Stay idle for 1 minute
  return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
},
```
Longer idle timeout means faster responses but more compute usage. Set to 0 to suspend
immediately after each turn (minimum latency cost, slight delay on next message).
Control how streamText results are converted to the frontend stream via toUIMessageStream(). Set static defaults on the task, or override per-turn.
Error handling with onError
When streamText encounters an error mid-stream (rate limits, API failures, network errors), the onError callback converts it to a string that's sent to the frontend as an { type: "error", errorText } chunk. The AI SDK's useChat receives this via its onError callback.

By default, the raw error message is sent to the frontend. Use onError to sanitize errors and avoid leaking internal details:
```ts
export const myChat = chat.agent({
  id: "my-chat",
  uiMessageStreamOptions: {
    onError: (error) => {
      // Log the full error server-side for debugging
      console.error("Stream error:", error);

      // Return a sanitized message — this is what the frontend sees
      if (error instanceof Error && error.message.includes("rate limit")) {
        return "Rate limited — please wait a moment and try again.";
      }
      return "Something went wrong. Please try again.";
    },
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```
onError is also called for tool execution errors, so a single handler covers both LLM errors and tool failures.

On the frontend, handle the error in useChat:
```ts
const { messages, sendMessage } = useChat({
  transport,
  onError: (error) => {
    // error.message contains the string returned by your onError handler
    toast.error(error.message);
  },
});
```
Reasoning and sources
Control which AI SDK features are forwarded to the frontend:
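For example (a sketch; sendReasoning also appears in the per-turn override below, and sendSources is the matching AI SDK option):

```ts
export const myChat = chat.agent({
  id: "my-chat",
  uiMessageStreamOptions: {
    sendReasoning: true, // forward reasoning parts to the frontend
    sendSources: true, // forward source/citation parts
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```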
By default, response message IDs are generated using the AI SDK’s built-in generateId. Pass a custom generateMessageId function to use your own ID format (e.g. UUID-v7):
```ts
import { v7 as uuidv7 } from "uuid";

export const myChat = chat.agent({
  id: "my-chat",
  uiMessageStreamOptions: {
    generateMessageId: () => uuidv7(),
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
  },
});
```
With the .withUIMessage() builder, set it under streamOptions:
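A sketch; MyUIMessage and the exact placement of streamOptions on the builder-produced config are assumptions (see Types):

```ts
export const myChat = chat
  .withUIMessage<MyUIMessage>() // MyUIMessage: your custom UIMessage subtype
  .agent({
    id: "my-chat",
    streamOptions: {
      generateMessageId: () => uuidv7(),
    },
    run: async ({ messages, signal }) => {
      return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal });
    },
  });
```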
The generated ID is sent to the frontend in the stream’s start chunk, so frontend and backend
always reference the same ID for each message. This is important for features like tool
approvals, where the frontend resends an assistant message and the backend needs to match it
by ID in the conversation accumulator.
Per-turn overrides
Override per-turn with chat.setUIMessageStreamOptions() — per-turn values merge with the static config (per-turn wins on conflicts). The override is cleared automatically after each turn.
```ts
run: async ({ messages, clientData, signal }) => {
  // Enable reasoning only for certain models
  if (clientData.model?.includes("claude")) {
    chat.setUIMessageStreamOptions({ sendReasoning: true });
  }
  return streamText({ model: openai(clientData.model ?? "gpt-4o"), messages, abortSignal: signal });
},
```
chat.setUIMessageStreamOptions() works across all abstraction levels — chat.agent(), chat.createSession() / turn.complete(), and chat.pipeAndCapture().

See ChatUIMessageStreamOptions for the full reference.
onFinish is managed internally for response capture and cannot be overridden here. Use
streamText’s onFinish callback for custom finish handling, or use raw task
mode for full control over toUIMessageStream().
If you need full control over task options, use the standard task() with ChatTaskPayload and chat.pipe():
```ts
import { task } from "@trigger.dev/sdk";
import { chat, type ChatTaskPayload } from "@trigger.dev/sdk/ai";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

export const manualChat = task({
  id: "manual-chat",
  retry: { maxAttempts: 3 },
  queue: { concurrencyLimit: 10 },
  run: async (payload: ChatTaskPayload) => {
    const result = streamText({
      model: openai("gpt-4o"),
      messages: payload.messages,
    });
    await chat.pipe(result);
  },
});
```
Manual mode does not get automatic message accumulation or the onTurnComplete/onChatStart
lifecycle hooks. The responseMessage field in onTurnComplete will be undefined when using
chat.pipe() directly. Use chat.agent() for the full multi-turn experience.
A middle ground between chat.agent() and raw primitives. You get an async iterator that yields ChatTurn objects — each turn handles stop signals, message accumulation, and turn-complete signaling automatically. You control initialization, model/tool selection, persistence, and any custom per-turn logic.

Use chat.createSession() inside a standard task():
```ts
import { task } from "@trigger.dev/sdk";
import { chat, type ChatTaskWirePayload } from "@trigger.dev/sdk/ai";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

export const myChat = task({
  id: "my-chat",
  run: async (payload: ChatTaskWirePayload, { signal }) => {
    // One-time initialization — just code, no hooks
    const clientData = payload.metadata as { userId: string };
    await db.chat.create({ data: { id: payload.chatId, userId: clientData.userId } });

    const session = chat.createSession(payload, {
      signal,
      idleTimeoutInSeconds: 60,
      timeout: "1h",
    });

    for await (const turn of session) {
      const result = streamText({
        model: openai("gpt-4o"),
        messages: turn.messages,
        abortSignal: turn.signal,
      });

      // Pipe, capture, accumulate, and signal turn-complete — all in one call
      await turn.complete(result);

      // Persist after each turn
      await db.chat.update({
        where: { id: turn.chatId },
        data: { messages: turn.uiMessages },
      });
    }
  },
});
```
turn.complete(result) is the easy path — it handles piping, capturing the response, accumulating messages, cleaning up aborted parts, and writing the turn-complete chunk.

For more control, you can do each step manually:
```ts
for await (const turn of session) {
  const result = streamText({
    model: openai("gpt-4o"),
    messages: turn.messages,
    abortSignal: turn.signal,
  });

  // Manual: pipe and capture separately
  const response = await chat.pipeAndCapture(result, { signal: turn.signal });

  if (response) {
    // Custom processing before accumulating
    await turn.addResponse(response);
  }

  // Custom persistence, analytics, etc.
  await db.chat.update({ ... });

  // Must call done() when not using complete()
  await turn.done();
}
```
For full control, use a standard task() with the composable primitives from the chat namespace. You manage everything: the turn loop, stop signals, message accumulation, and turn-complete signaling.

Raw task mode also lets you call .toUIMessageStream() yourself with any options — including onFinish and originalMessages. This is the right choice when you need complete control over the stream conversion beyond what chat.setUIMessageStreamOptions() provides.