AI Browser Agents

AI browser agents are systems driven by large language models (LLMs) that perceive web pages, plan actions, and execute them through a real browser to complete user tasks. They sit at the intersection of three fast-moving fields: agent reasoning (planning, memory, recovery), web interaction (DOM, vision, action grounding), and infrastructure (cloud browsers, sessions, identities). This category defines the vocabulary developers need to design and ship agents that work reliably in production. It covers computer use, natural-language automation, action and DOM grounding, agent observability, self-healing selectors, evaluation harnesses, and the trade-offs between vision-language and DOM-first approaches. If you're building anything that lets an LLM click, type, or read a webpage, the terms here are the foundation.

12 terms in this category
