
What is agent memory in browser agents?

By Lucas Giordano · Co-founder, Notte
TL;DR

Agent memory is the layer that lets a browser agent reuse what it learned earlier — within a single run (working memory), across runs of the same task (episodic), or as long-lived state (persistent). Without memory, every step is a cold start. With it, the agent recovers from errors, avoids re-exploring the same dead ends, and gets faster on tasks it's seen before.

What is agent memory in browser agents?

The model on the inside of a browser agent is stateless. Each call sees only what's in the prompt — the user's goal, whatever observation was just captured, and whatever the agent layer chose to include from the past. "Memory" in this context isn't model memory; it's the discipline of deciding what to feed back into the next prompt so the agent doesn't re-explore the same dead end every iteration. Get that right and the agent compounds knowledge across steps and runs. Get it wrong and it spends three steps re-discovering the cookie banner it dismissed on step one.
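A minimal sketch of that idea, assuming a hypothetical `build_prompt` helper and `Step` record (neither comes from any particular agent framework): the model only ever sees the string this function returns, so "memory" is whatever the agent layer chooses to put into it.

```python
from dataclasses import dataclass


@dataclass
class Step:
    action: str   # what the agent tried, e.g. "click #accept-cookies"
    outcome: str  # what happened, e.g. "banner dismissed"


def build_prompt(goal: str, observation: str, history: list[Step], max_steps: int = 5) -> str:
    """Assemble the only text the stateless model will see on this call."""
    recent = history[-max_steps:]  # the agent layer decides what to carry forward
    lines = [f"Goal: {goal}", "Recent steps:"]
    if recent:
        lines += [f"- {s.action} -> {s.outcome}" for s in recent]
    else:
        lines.append("- (none yet)")
    lines += ["Current page:", observation, "Next action:"]
    return "\n".join(lines)


# Hypothetical example data: one earlier step is fed back into the next prompt.
history = [Step("click #accept-cookies", "banner dismissed")]
prompt = build_prompt("Download the monthly invoice", "Dashboard with 'Billing' link", history)
print(prompt)
```

Because the dismissed cookie banner appears under "Recent steps", the model has no reason to look for it again on the next iteration.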

Three layers of memory

Browser agents use memory at three different time scales, each addressing a different failure mode:

  1. Working memory. The recent action history within a single run. Each iteration's prompt includes the last N actions, their outcomes, and the agent's running plan. This is what stops the agent from clicking the same button five times in a row when the click doesn't change the page state. Almost every agent stack does this; it's table stakes.
  2. Episodic memory. State that survives across runs of the same task. "Last time we tried this flow on this site, the consent banner was inside a shadow DOM and clicking it required dispatching a JavaScript event." That kind of learned fact gets pulled into the next run's context so the agent doesn't repeat the same exploration. This layer is less common in production today, but it is emerging as the next layer of reliability.
  3. Persistent state outside the agent. The browser's cookies, localStorage, cache — what survives between sessions independent of the model. Strictly speaking this is session persistence, not agent memory, but in practice it serves the same function: the agent doesn't have to re-authenticate or re-accept cookies every run. Pair with a browser profile for the strongest version.

The honest comparison:

| | Working memory | Episodic memory | Browser persistence |
| --- | --- | --- | --- |
| Lifespan | Within one run | Across runs of one task | Across runs of one user/account |
| Stored where | Prompt context | Vector store / structured log | Cookies, localStorage, profile |
| What it remembers | Recent actions, plan | Patterns, dead ends, site-specific quirks | Authenticated state, preferences |
| Failure if absent | Loops on stuck steps | Re-explores the same path each run | Re-authenticates every run |
| Implementation cost | Trivial: include in prompt | Real: design the store | Profile-managed |

What good working memory looks like

Three properties production stacks land on:

  • Action + outcome + observation hash. Each step records what was attempted, whether it succeeded, and a compact representation of the resulting page state. The next step's prompt sees the recent slice. That's enough to detect "I clicked, page didn't change" without re-reading the entire DOM.
  • The agent's running plan. A short summary of what it's trying to do across the next several steps. Without this, every iteration starts from "what was I doing?" Adding ~200 tokens of plan context dramatically reduces wasted exploration.
  • Bounded context size. The temptation is to include everything; the cost is context-window exhaustion. Production stacks either truncate (keep only the most recent N actions) or summarize (every M actions, compress the older ones into a single paragraph).
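The three properties above can be sketched in one small class. This is an illustration under stated assumptions, not a reference implementation: `WorkingMemory`, `StepRecord`, and `hash_observation` are hypothetical names, the page state is assumed to arrive as a string, and the "summary" is a plain concatenation standing in for whatever compression a real stack would use.

```python
import hashlib
from collections import deque
from dataclasses import dataclass


@dataclass
class StepRecord:
    action: str
    succeeded: bool
    page_hash: str  # compact stand-in for the resulting page state


def hash_observation(observation: str) -> str:
    """Short digest of the page state; equal digests mean the page looks unchanged."""
    return hashlib.sha256(observation.encode("utf-8")).hexdigest()[:12]


class WorkingMemory:
    def __init__(self, max_recent: int = 5):
        self.plan = ""                          # the agent's running plan
        self.recent = deque(maxlen=max_recent)  # bounded: oldest records fall off
        self.older_summary = ""                 # compressed trace of evicted records

    def record(self, action: str, succeeded: bool, observation: str) -> None:
        if len(self.recent) == self.recent.maxlen:
            evicted = self.recent[0]
            # Placeholder summarizer (assumption): a real stack might compress with a model.
            self.older_summary = (self.older_summary + f" {evicted.action};").strip()
        self.recent.append(StepRecord(action, succeeded, hash_observation(observation)))

    def is_stuck(self) -> bool:
        """True when the last two steps left the page in the same state."""
        if len(self.recent) < 2:
            return False
        return self.recent[-1].page_hash == self.recent[-2].page_hash

    def render(self) -> str:
        """Text slice to include in the next prompt."""
        lines = [f"Plan: {self.plan}"]
        if self.older_summary:
            lines.append(f"Earlier: {self.older_summary}")
        for r in self.recent:
            status = "ok" if r.succeeded else "failed"
            lines.append(f"- {r.action} [{status}] page={r.page_hash}")
        return "\n".join(lines)


# Hypothetical run: the same click twice leaves the page unchanged.
memory = WorkingMemory(max_recent=3)
memory.plan = "Open Billing, then download the latest invoice"
memory.record("click 'Billing'", True, "<html>dashboard</html>")
memory.record("click 'Billing'", True, "<html>dashboard</html>")
print(memory.is_stuck())
print(memory.render())
```

Comparing two short digests is what lets the agent notice "I clicked, page didn't change" without putting either DOM snapshot into the prompt.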

When episodic memory pays off

Episodic memory adds engineering cost and storage overhead. The cases where it earns the cost:

  • Tasks that run regularly against the same target. A daily report-pull from the same dashboard. Every run after the first should skip discovery and follow the known navigation path; without episodic memory, every run rediscovers it.
  • Sites with non-obvious quirks. Sites that gate pages behind interstitials, embed challenges in iframes, or hide critical elements off-screen. Once the agent figures out the trick, it should remember.
  • Multi-tenant products. Each customer's account has its own quirks. Per-customer episodic memory amortizes discovery cost across the customer's runs.

If the task is genuinely one-off, episodic memory is overkill — working memory is enough.
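For the recurring-task case, a minimal sketch of an episodic store might look like the following. It assumes a JSON file is an acceptable backing store and keys facts by task name plus site hostname; `EpisodicStore`, the task name, and the example URLs are all hypothetical, and a production system might use a vector store or structured log instead.

```python
import json
import tempfile
from pathlib import Path
from urllib.parse import urlparse


class EpisodicStore:
    """Learned facts keyed by (task, site), kept in a JSON file between runs."""

    def __init__(self, path: Path):
        self.path = path
        self.facts = json.loads(path.read_text()) if path.exists() else {}

    @staticmethod
    def _key(task: str, url: str) -> str:
        # Key on the hostname so facts apply to every page of the same site.
        return f"{task}::{urlparse(url).netloc}"

    def remember(self, task: str, url: str, fact: str) -> None:
        notes = self.facts.setdefault(self._key(task, url), [])
        if fact not in notes:  # avoid storing the same lesson twice
            notes.append(fact)
        self.path.write_text(json.dumps(self.facts, indent=2))

    def recall(self, task: str, url: str) -> list[str]:
        return list(self.facts.get(self._key(task, url), []))


# Run 1 discovers a quirk and writes it down (hypothetical example data).
store_path = Path(tempfile.mkdtemp()) / "episodic.json"
run1 = EpisodicStore(store_path)
run1.remember(
    "daily-report",
    "https://dashboard.example.com/login",
    "Consent banner is inside a shadow DOM; dispatch a JavaScript click event.",
)

# Run 2 starts from a fresh object and still sees the fact.
run2 = EpisodicStore(store_path)
hints = run2.recall("daily-report", "https://dashboard.example.com/reports")
print(hints)
```

The recalled hints would be added to the first prompt of the next run. For a multi-tenant product, one option is to add a customer identifier to the key so that each account accumulates its own facts.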

Common pitfalls

  • Skipping working memory entirely. Surprisingly common in early agent prototypes; the agent loops on the first stuck step. Always include recent actions and outcomes in the next prompt.
  • Including the entire DOM in working memory. Context-window exhaustion in three iterations. Hash, summarize, or include only deltas.
  • Conflating agent memory with session persistence. They solve different problems. Memory is about what the agent has learned and what it plans to do; persistence is about the browser's state. Both matter; neither replaces the other.
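As one way to act on "include only deltas", the sketch below compares two page snapshots and returns just what changed. It assumes each snapshot has already been reduced to a list of element descriptors (an assumption; the article does not prescribe a representation), and `observation_delta` is a hypothetical helper name.

```python
def observation_delta(previous: list[str], current: list[str]) -> dict[str, list[str]]:
    """Return only what changed between two page snapshots."""
    prev_set, curr_set = set(previous), set(current)
    return {
        "added": [e for e in current if e not in prev_set],
        "removed": [e for e in previous if e not in curr_set],
    }


# Hypothetical snapshots taken before and after dismissing a cookie banner.
before = ["button#accept-cookies", "a[href='/billing']", "nav.main"]
after = ["a[href='/billing']", "nav.main", "div.toast 'Preferences saved'"]
delta = observation_delta(before, after)
print(delta)
```

Only the two changed elements need to go into the next prompt; the unchanged navigation and links stay out of the context window.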

Key takeaways

  • Agent memory is the discipline of deciding what past state to include in the next prompt; the model itself is stateless.
  • Three layers, three time scales: working (within a run), episodic (across runs of one task), browser persistence (across runs for one user).
  • Working memory is table stakes — recent actions, outcomes, running plan. Skip it and the agent loops on stuck steps.
  • Episodic memory pays off on recurring tasks, quirky sites, and multi-tenant products; skip it for one-offs.
  • Browser persistence (cookies, profiles) is the layer below — same goal of not re-doing work, different scope. See session persistence and browser profiles.
