What is agent memory in browser agents?

Agent memory is the layer that lets a browser agent reuse what it learned earlier — within a single run (working memory), across runs of the same task (episodic), or as long-lived state (persistent). Without memory, every step is a cold start. With it, the agent recovers from errors, avoids re-exploring the same dead ends, and gets faster on tasks it's seen before.
The model inside a browser agent is stateless. Each call sees only what's in the prompt: the user's goal, whatever observation was just captured, and whatever the agent layer chose to include from the past. "Memory" in this context isn't model memory; it's the discipline of deciding what to feed back into the next prompt so the agent doesn't re-explore the same dead end every iteration. Get that right and the agent compounds knowledge across steps and runs. Get it wrong and it spends three steps rediscovering the cookie banner it dismissed on step one.
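To make the statelessness concrete, here is a minimal sketch of prompt assembly. The `AgentContext` shape, the `build_prompt` name, and the prompt layout are all assumptions for illustration, not any particular framework's API; the point is only that the model sees nothing beyond the string this function returns.

```python
from dataclasses import dataclass, field


@dataclass
class AgentContext:
    """Everything the agent layer may feed into the next model call (hypothetical shape)."""
    goal: str
    observation: str  # latest page snapshot, already condensed
    past_notes: list[str] = field(default_factory=list)  # what the agent layer carries forward


def build_prompt(ctx: AgentContext) -> str:
    """Assemble the full prompt; the model has no other source of history."""
    history = "\n".join(f"- {note}" for note in ctx.past_notes) or "- (nothing yet)"
    return (
        f"GOAL:\n{ctx.goal}\n\n"
        f"WHAT HAS HAPPENED SO FAR:\n{history}\n\n"
        f"CURRENT PAGE:\n{ctx.observation}\n\n"
        "Reply with the single next action."
    )


ctx = AgentContext(
    goal="Download the latest invoice",
    observation="Billing page; buttons: [Invoices] [Payment methods]",
    past_notes=["Step 1: dismissed cookie banner (succeeded)"],
)
prompt = build_prompt(ctx)
```

If `past_notes` is left empty, the model has no way to know the cookie banner was already handled, which is exactly the cold-start behavior described above.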
Three layers of memory
Browser agents use memory at three different time scales, each addressing a different failure mode:
- Working memory. The recent action history within a single run. Each iteration's prompt includes the last N actions, their outcomes, and the agent's running plan. This is what stops the agent from clicking the same button five times in a row when the click doesn't change the page state. Almost every agent stack does this; it's table stakes.
- Episodic memory. State that survives across runs of the same task. "Last time we tried this flow on this site, the consent banner sat inside a shadow DOM and clicking it required dispatching a JavaScript event." That kind of learned fact gets pulled into the next run's context so the agent doesn't repeat the same exploration. This is less common in production today, but it is emerging as the next layer of reliability.
- Persistent state outside the agent. The browser's cookies, localStorage, cache — what survives between sessions independent of the model. Strictly speaking this is session persistence, not agent memory, but in practice it serves the same function: the agent doesn't have to re-authenticate or re-accept cookies every run. Pair with a browser profile for the strongest version.
The honest comparison:
| | Working memory | Episodic memory | Browser persistence |
|---|---|---|---|
| Lifespan | Within one run | Across runs of one task | Across runs of one user/account |
| Stored where | Prompt context | Vector store / structured log | Cookies, localStorage, profile |
| What it remembers | Recent actions, plan | Patterns, dead ends, site-specific quirks | Authenticated state, preferences |
| Failure if absent | Loops on stuck steps | Re-explores the same path each run | Re-authenticates every run |
| Implementation cost | Trivial — include in prompt | Real — design the store | Profile-managed |
What good working memory looks like
Three properties production stacks land on:
- Action + outcome + observation hash. Each step records what was attempted, whether it succeeded, and a compact representation of the resulting page state. The next step's prompt sees the recent slice. That's enough to detect "I clicked, page didn't change" without re-reading the entire DOM.
- The agent's running plan. A short summary of what it's trying to do across the next several steps. Without this, every iteration starts from "what was I doing?" Adding ~200 tokens of plan context dramatically reduces wasted exploration.
- Bounded context size. The temptation is to include everything; the cost is context-window exhaustion. Production stacks either truncate (keep only the most recent N actions) or summarize (every M actions, compress the older ones into a single paragraph).
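The three properties above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions: the `WorkingMemory` and `Step` names, the 12-character SHA-256 fingerprint, and the default window of five steps are all hypothetical choices, and the sketch truncates old steps rather than summarizing them.

```python
import hashlib
from collections import deque
from dataclasses import dataclass


@dataclass
class Step:
    action: str      # what was attempted, e.g. "click #submit"
    succeeded: bool  # whether the action executed without error
    page_hash: str   # compact fingerprint of the resulting page state


def page_fingerprint(page_text: str) -> str:
    """Short hash of a page snapshot; equal hashes mean identical snapshot text."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()[:12]


class WorkingMemory:
    def __init__(self, plan: str, max_steps: int = 5):
        self.plan = plan
        # A bounded deque drops the oldest step automatically once full.
        self.steps: deque[Step] = deque(maxlen=max_steps)

    def record(self, action: str, succeeded: bool, page_text: str) -> None:
        self.steps.append(Step(action, succeeded, page_fingerprint(page_text)))

    def last_action_changed_page(self) -> bool:
        """False when the two most recent steps ended on the same fingerprint."""
        if len(self.steps) < 2:
            return True
        return self.steps[-1].page_hash != self.steps[-2].page_hash

    def render(self) -> str:
        """Text block to include in the next prompt."""
        lines = [f"PLAN: {self.plan}", "RECENT STEPS:"]
        for s in self.steps:
            status = "ok" if s.succeeded else "failed"
            lines.append(f"- {s.action} [{status}] page={s.page_hash}")
        if not self.last_action_changed_page():
            lines.append("NOTE: the last action did not change the page; try something else.")
        return "\n".join(lines)


memory = WorkingMemory(plan="Open Billing, then download the latest invoice", max_steps=3)
memory.record("click #download", True, "Billing page")
memory.record("click #download", True, "Billing page")  # same page text as before
rendered = memory.render()
```

One caveat with this design: hashing raw page text treats any change as a new state, so a ticking clock or rotating ad would mask a stuck step. A real stack would likely fingerprint a normalized snapshot instead.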
When episodic memory pays off
Episodic memory adds engineering cost and storage overhead. The cases where it earns the cost:
- Tasks that run regularly against the same target. A daily report-pull from the same dashboard. Every run after the first should go straight to the known navigation path; without episodic memory, every run rediscovers it.
- Sites with non-obvious quirks. Sites that gate pages behind interstitials, embed challenges in iframes, or hide critical elements off-screen. Once the agent figures out the trick, it should remember.
- Multi-tenant products. Each customer's account has its own quirks. Per-customer episodic memory amortizes discovery cost across the customer's runs.
If the task is genuinely one-off, episodic memory is overkill — working memory is enough.
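As a rough sketch of what an episodic store could look like, the example below keeps learned notes in a JSON file keyed by site and task. The `EpisodicStore` class, its method names, and the key format are assumptions made for illustration; a production design might use a vector store or structured log instead, and a multi-tenant product would add a customer identifier to the key.

```python
import json
import tempfile
from pathlib import Path


class EpisodicStore:
    """Notes learned in earlier runs, keyed by site and task (hypothetical design)."""

    def __init__(self, path: Path):
        self.path = path
        # Reload whatever earlier runs saved; start empty on the first run.
        self.data: dict[str, list[str]] = (
            json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
        )

    @staticmethod
    def _key(site: str, task: str) -> str:
        return f"{site}::{task}"

    def remember(self, site: str, task: str, note: str) -> None:
        notes = self.data.setdefault(self._key(site, task), [])
        if note not in notes:  # avoid storing the same lesson twice
            notes.append(note)
        self.path.write_text(json.dumps(self.data, indent=2), encoding="utf-8")

    def recall(self, site: str, task: str, limit: int = 5) -> list[str]:
        """Most recent notes for this site and task, for the next run's prompt."""
        return self.data.get(self._key(site, task), [])[-limit:]


# Simulate two runs of the same task sharing one store file.
with tempfile.TemporaryDirectory() as tmp:
    store_path = Path(tmp) / "episodic.json"

    first_run = EpisodicStore(store_path)
    first_run.remember(
        "example.com", "daily-report",
        "Consent banner is inside a shadow root; dispatch the click from JavaScript.",
    )

    second_run = EpisodicStore(store_path)  # a later run reloads from disk
    recalled = second_run.recall("example.com", "daily-report")
    unrelated = second_run.recall("example.com", "some-other-task")
```

The recalled notes would be prepended to the second run's first prompt, so the agent starts with the lesson instead of rediscovering it.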
Common pitfalls
- Skipping working memory entirely. Surprisingly common in early agent prototypes; the agent loops on the first stuck step. Always include recent actions and outcomes in the next prompt.
- Including the entire DOM in working memory. Context-window exhaustion in three iterations. Hash, summarize, or include only deltas.
- Conflating agent memory with session persistence. They solve different problems. Memory is about the agent's plan; persistence is about the browser's state. Both matter; neither replaces the other.
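For the "include only deltas" option mentioned in the pitfalls above, one possible approach is to diff consecutive text snapshots of the page and pass along only the changed lines. The `page_delta` helper is a hypothetical name, and the sketch assumes the agent already has a line-oriented text rendering of the page; it uses `difflib.ndiff` from the Python standard library.

```python
import difflib


def page_delta(before: str, after: str) -> str:
    """Return only the lines added or removed between two page snapshots."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff prefixes removed lines with "- " and added lines with "+ ";
    # unchanged lines ("  ") and hint lines ("? ") are dropped.
    changed = [line for line in diff if line.startswith(("+ ", "- "))]
    return "\n".join(changed) or "(no visible change)"


before = "Cart (0)\nButton: Add to cart\nFooter links"
after = "Cart (1)\nButton: Add to cart\nFooter links"
delta = page_delta(before, after)
```

Here the prompt would carry two short lines about the cart counter rather than the whole page, and an unchanged page collapses to a single marker the agent can treat as a stuck step.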
Key takeaways
- Agent memory is the discipline of deciding what past state to include in the next prompt; the model itself is stateless.
- Three layers, three time scales: working (within a run), episodic (across runs of one task), browser persistence (across runs for one user).
- Working memory is table stakes — recent actions, outcomes, running plan. Skip it and the agent loops on stuck steps.
- Episodic memory pays off on recurring tasks, quirky sites, and multi-tenant products; skip it for one-offs.
- Browser persistence (cookies, profiles) is the layer below — same goal of not re-doing work, different scope. See session persistence and browser profiles.