Memory

AgentsMemory

By default, an agent keeps nothing between runs: every conversation starts completely fresh. The Memory setting changes that: choose Conversation, give it a conversation ID, and everything said under that key is kept and loaded back before the model runs.

What you will learn

Runs start fresh by default

Without memory, a customer who explained everything yesterday has to explain it all again today: the agent has no record that yesterday happened.

One setting and a key

Memory is configured on the Agent block: pick Conversation, choose a conversation ID, and each turn is saved under that key as it happens.

Recall happens before the model runs

On the next run, everything stored under the key is loaded back into the conversation first: so the agent answers like no time has passed.

Keys are separate threads

Each conversation ID is its own thread, so one agent holds a different conversation with every customer. Sliding-window modes keep the most recent messages or tokens when a thread outgrows the model.

Here is the agent from the video with Memory set on the block:

The same agent, with and without memory

The video runs the same agent twice, side by side: once with no memory and once with the conversation ID. The same follow-up question arrives in both. Without the key, the agent starts from zero and has to ask for everything again; with it, everything stored under the key was loaded back before the model saw the new message, and the answer picks up exactly where yesterday stopped.

When conversations grow

Memory can also be a sliding window, keeping the most recent messages, or the most recent tokens, while the oldest quietly fall away. The stored transcript keeps every turn; the window controls how much of it rides into the model on each run.

When to reach for memory

Any agent that talks to the same person or process more than once needs memory: support desks, sales assistants, scheduled check-ins. Stateless agents are the right default for one-shot tasks like classification or extraction, where history would only add cost.

Common Questions

Conversation keeps every turn under a conversation ID. Sliding window (messages) keeps the most recent messages, and sliding window (tokens) keeps as much recent history as fits a token budget.

Any stable key that identifies the thread: a customer ID, a ticket number, a channel ID. Runs that share the key share the conversation; runs with different keys never see each other's history.

Recalled history is loaded into the model's context, so it is billed as input tokens like any other message. Sliding-window modes exist to cap that cost on long-running conversations.

In your Sim workspace, as a stored transcript per conversation ID. You can inspect it, and the agent reads from it automatically before each run.

Agent block

Memory

The same agent, with and without memory

When conversations grow

When to reach for memory

Common Questions

Related documentation