AI Browser Agents

AI browser agents are systems driven by large language models (LLMs) that perceive web pages, plan actions, and execute them through a real browser to complete user tasks. They sit at the intersection of three fast-moving fields: agent reasoning (planning, memory, recovery), web interaction (DOM, vision, action grounding), and infrastructure (cloud browsers, sessions, identities). This category defines the vocabulary developers need to design and ship agents that work in production. It covers computer use, natural-language automation, action and DOM grounding, agent observability, self-healing selectors, evaluation harnesses, and the trade-offs between vision-language and DOM-first approaches. If you're building anything that lets an LLM click, type, or read a webpage, the terms here are the foundation.
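
The perceive, plan, and execute cycle described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `FakePage` is a hypothetical in-memory stand-in for a real browser page, and `plan_next_action` is a rule-based placeholder where a production agent would call an LLM. It is not any particular library's API.

```python
from dataclasses import dataclass, field


@dataclass
class FakePage:
    """Hypothetical stand-in for a real browser page (no browser is launched)."""
    fields: dict = field(default_factory=dict)
    submitted: bool = False

    def observe(self) -> dict:
        # Perceive: return a simplified snapshot, as a DOM or screenshot parser might.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def act(self, action: dict) -> None:
        # Execute: apply one action to the page.
        if action["type"] == "type":
            self.fields[action["target"]] = action["text"]
        elif action["type"] == "click" and action["target"] == "submit":
            self.submitted = True
        else:
            raise ValueError(f"unsupported action: {action}")


def plan_next_action(observation: dict, goal: dict):
    """Rule-based placeholder for the LLM planner: next action, or None when done."""
    for name, text in goal.items():
        if observation["fields"].get(name) != text:
            return {"type": "type", "target": name, "text": text}
    if not observation["submitted"]:
        return {"type": "click", "target": "submit"}
    return None


def run_agent(page: FakePage, goal: dict, max_steps: int = 10) -> list:
    """Run the perceive -> plan -> act loop and return the actions taken."""
    history = []
    for _ in range(max_steps):
        observation = page.observe()                     # perceive
        action = plan_next_action(observation, goal)     # plan
        if action is None:
            break                                        # goal reached
        page.act(action)                                 # execute
        history.append(action)
    return history
```

Calling `run_agent(FakePage(), {"email": "a@example.com"})` in this sketch fills the one field and then clicks submit. The `max_steps` cap is a simple guard against a planner that never converges; real agents typically pair it with recovery and verification logic.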

12 terms in this category

Common Questions

What is a browser agent?
What is computer use (CUA)?
What is natural language browser automation?
Browser agents vs RPA
Browser agents vs traditional web scrapers
How do browser agents recover from errors?
How does a browser agent perceive a page (vision vs DOM)?
What are self-healing selectors?
What is agent memory in browser agents?
What is browser agent observability?
What are browser agent benchmarks (WebArena, Mind2Web, VisualWebArena)?
What is a verifier in browser agents?

Other categories

Browser Identity & Auth

Digital identities, credential vaults, 2FA, CAPTCHAs, and the patterns AI agents need to log in like a real user.

Browser Automation

Foundational concepts: headless browsers, cloud browsers, fingerprinting, proxies, sessions, and detection.

Agentic Web APIs

Wrap browser-driven work as callable Web APIs — the layer that exposes agent runs as durable, scheduled, schema-typed endpoints.

Web Scraping

Scraping APIs, anti-scraping defenses, dynamic content, and the patterns for getting data off the modern web.

Web Data for AI

Structured extraction, LLM-ready content, schema-based parsing, and the formats AI systems consume.

