What is agent memory in browser agents?

The model inside a browser agent is stateless. Each call sees only what's in the prompt: the user's goal, the observation that was just captured, and whatever the agent layer chose to include from the past. "Memory" in this context isn't model memory; it's the discipline of deciding what to feed back into the next prompt so the agent doesn't re-explore the same dead end every iteration. Get that right and the agent compounds knowledge across steps and runs. Get it wrong and it spends three steps rediscovering the cookie banner it dismissed on step one.
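To make the statelessness concrete, here is a minimal sketch of prompt assembly. The function name `build_prompt` and the section labels are hypothetical, chosen for illustration; real agent stacks structure their prompts differently.

```python
def build_prompt(goal: str, observation: str, remembered: list[str]) -> str:
    """Rebuild the full prompt from scratch for a single model call.

    Nothing carries over between calls: anything missing from `remembered`
    is invisible to the model on this step.
    """
    sections = [f"Goal: {goal}", f"Current page: {observation}"]
    if remembered:
        notes = "\n".join(f"- {item}" for item in remembered)
        sections.append(f"From earlier steps:\n{notes}")
    return "\n\n".join(sections)
```

Every memory layer described below is, in the end, a different way of filling the `remembered` argument.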

Three layers of memory

Browser agents use memory at three different time scales, each addressing a different failure mode:

Working memory. The recent action history within a single run. Each iteration's prompt includes the last N actions, their outcomes, and the agent's running plan. This is what stops the agent from clicking the same button five times in a row when the click doesn't change the page state. Almost every agent stack does this; it's table stakes.
Episodic memory. State that survives across runs of the same task. "Last time we tried this flow on this site, the consent banner was inside a shadow DOM and clicking it required dispatching a JavaScript event." That kind of learned fact gets pulled into the next run's context so the agent doesn't repeat the same exploration. Less common in production today; emerging as the next layer of reliability.
Persistent state outside the agent. The browser's cookies, localStorage, cache — what survives between sessions independent of the model. Strictly speaking this is session persistence, not agent memory, but in practice it serves the same function: the agent doesn't have to re-authenticate or re-accept cookies every run. Pair with a browser profile for the strongest version.

The honest comparison:

	Working memory	Episodic memory	Browser persistence
Lifespan	Within one run	Across runs of one task	Across runs of one user/account
Stored where	Prompt context	Vector store / structured log	Cookies, localStorage, profile
What it remembers	Recent actions, plan	Patterns, dead ends, site-specific quirks	Authenticated state, preferences
Failure if absent	Loops on stuck steps	Re-explores the same path each run	Re-authenticates every run
Implementation cost	Trivial — include in prompt	Real — design the store	Profile-managed

What good working memory looks like

Three properties production stacks land on:

Action + outcome + observation hash. Each step records what was attempted, whether it succeeded, and a compact representation of the resulting page state. The next step's prompt sees the recent slice. That's enough to detect "I clicked, page didn't change" without re-reading the entire DOM.
The agent's running plan. A short summary of what it's trying to do across the next several steps. Without this, every iteration starts from "what was I doing?" Adding ~200 tokens of plan context dramatically reduces wasted exploration.
Bounded context size. The temptation is to include everything; the cost is context-window exhaustion. Production stacks either truncate (keep only the most recent N actions) or summarize (every M actions, compress the older ones into a single paragraph).
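The three properties above can be sketched together in a few lines. This is an illustrative sketch, not a reference implementation: `StepRecord`, `WorkingMemory`, and `observation_hash` are hypothetical names, the page state is assumed to arrive as a text snapshot, and a plain count of dropped steps stands in for a real summary of older actions.

```python
import hashlib
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class StepRecord:
    action: str      # what was attempted, e.g. "click #submit"
    succeeded: bool  # whether the action executed without error
    obs_hash: str    # compact fingerprint of the resulting page state


def observation_hash(page_text: str) -> str:
    """Fingerprint a page snapshot so it can be compared without storing it."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()[:12]


class WorkingMemory:
    def __init__(self, plan: str, max_recent: int = 5) -> None:
        self.plan = plan                        # short running plan
        self.recent = deque(maxlen=max_recent)  # bounded window of StepRecord
        self.older_steps = 0                    # steps that fell out of the window

    def record(self, action: str, succeeded: bool, page_text: str) -> None:
        if len(self.recent) == self.recent.maxlen:
            self.older_steps += 1  # the oldest record is about to be dropped
        self.recent.append(
            StepRecord(action, succeeded, observation_hash(page_text))
        )

    def last_action_had_no_effect(self) -> bool:
        """True when the page fingerprint is unchanged after the latest action."""
        if len(self.recent) < 2:
            return False
        return self.recent[-2].obs_hash == self.recent[-1].obs_hash

    def render(self) -> str:
        """Produce the slice of history that goes into the next prompt."""
        lines = [f"Plan: {self.plan}"]
        if self.older_steps:
            lines.append(f"(earlier steps omitted: {self.older_steps})")
        for step in self.recent:
            status = "ok" if step.succeeded else "failed"
            lines.append(f"- {step.action} [{status}] page={step.obs_hash}")
        if self.last_action_had_no_effect():
            lines.append("Note: the last action did not change the page.")
        return "\n".join(lines)
```

Hashing the whole snapshot is the crudest possible fingerprint; pages with timestamps or rotating ads would need a normalized snapshot first, which this sketch leaves out.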

When episodic memory pays off

Episodic memory adds engineering cost and storage overhead. The cases where it earns the cost:

Tasks that run regularly against the same target. A daily report-pull from the same dashboard. Every run after the first should go straight to the known navigation path; without episodic memory, every run rediscovers it.
Sites with non-obvious quirks. Sites that gate pages behind interstitials, embed challenges in iframes, or hide critical elements off-screen. Once the agent figures out the trick, it should remember.
Multi-tenant products. Each customer's account has its own quirks. Per-customer episodic memory amortizes discovery cost across the customer's runs.

If the task is genuinely one-off, episodic memory is overkill — working memory is enough.
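For the cases that do justify it, an episodic store can start as a small structured log. The sketch below assumes a JSON file on disk and a key made of tenant, site, and task; `EpisodicStore` and its method names are hypothetical, and a vector store would replace the exact-key lookup in a larger system.

```python
import json
from pathlib import Path


class EpisodicStore:
    """Notes learned on earlier runs, keyed by (tenant, site, task)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Load notes from a previous run if the file already exists.
        self.notes = json.loads(path.read_text()) if path.exists() else {}

    @staticmethod
    def _key(tenant: str, site: str, task: str) -> str:
        return f"{tenant}|{site}|{task}"

    def remember(self, tenant: str, site: str, task: str, note: str) -> None:
        """Store a learned fact once and persist it immediately."""
        bucket = self.notes.setdefault(self._key(tenant, site, task), [])
        if note not in bucket:
            bucket.append(note)
            self.path.write_text(json.dumps(self.notes, indent=2))

    def recall(self, tenant: str, site: str, task: str, limit: int = 5) -> list[str]:
        """Return the most recent notes for this exact tenant, site, and task."""
        return self.notes.get(self._key(tenant, site, task), [])[-limit:]
```

At the start of a run, the result of `recall(...)` would be added to the prompt; at the end, anything the agent discovered the hard way goes through `remember(...)`. Keying on the tenant keeps one customer's quirks from leaking into another's runs.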

Common pitfalls

Skipping working memory entirely. Surprisingly common in early agent prototypes; the agent loops on the first stuck step. Always include recent actions and outcomes in the next prompt.
Including the entire DOM in working memory. Context-window exhaustion in three iterations. Hash, summarize, or include only deltas.
Conflating agent memory with session persistence. They solve different problems. Memory is about what the agent knows: its past actions, their outcomes, and its plan. Persistence is about the browser's state: cookies, storage, and logged-in sessions. Both matter; neither replaces the other.
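As an illustration of the "include only deltas" option from the DOM pitfall above, the sketch below compares two page snapshots and renders only what changed. It assumes each snapshot has already been reduced to a mapping from an element identifier to its visible text; `snapshot_delta` and `render_delta` are hypothetical helpers, and how the snapshot is extracted from the page is out of scope here.

```python
def snapshot_delta(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {element_id: visible_text} snapshots and list what changed."""
    added = sorted(k for k in after if k not in before)
    removed = sorted(k for k in before if k not in after)
    changed = sorted(k for k in after if k in before and before[k] != after[k])
    return {"added": added, "removed": removed, "changed": changed}


def render_delta(delta: dict[str, list[str]], after: dict[str, str]) -> str:
    """Format the delta as a few prompt lines instead of the full page."""
    lines = []
    for key in delta["added"]:
        lines.append(f"+ {key}: {after[key]}")
    for key in delta["changed"]:
        lines.append(f"~ {key}: {after[key]}")
    for key in delta["removed"]:
        lines.append(f"- {key}")
    return "\n".join(lines) or "(no visible change)"
```

The rendered delta is typically a handful of lines even when the underlying page is large, which is the point: the prompt grows with what changed, not with the size of the DOM.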

Key takeaways

Agent memory is the discipline of deciding what past state to include in the next prompt; the model itself is stateless.
Three layers, three time scales: working (within a run), episodic (across runs of one task), browser persistence (across runs for one user).
Working memory is table stakes — recent actions, outcomes, running plan. Skip it and the agent loops on stuck steps.
Episodic memory pays off on recurring tasks, quirky sites, and multi-tenant products; skip it for one-offs.
Browser persistence (cookies, profiles) is the layer below — same goal of not re-doing work, different scope. See session persistence and browser profiles.
