How Notte Works
Understand how agents think, observe, and act — and how the perception layer helps LLMs interact with real web pages.
The Notte Architecture
Notte gives structure to the web, allowing language models to reason and act like agents. Instead of working with messy raw HTML, it builds a simplified and readable representation of every webpage so an LLM can understand what's on screen and decide what to do next.
When you run an agent, four components work together:
- Agent: interprets your task and manages the reasoning loop
- Session: launches a real browser and keeps it alive
- Perception Layer: converts the DOM into a list of possible actions
- LLM: chooses the best next step (click, fill, etc.) based on context
Each time your agent runs, it observes the page, reasons about what to do, and acts — all while logging every decision.
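The observe, reason, act cycle described above can be sketched in plain Python. This is a minimal illustration, not Notte's implementation: `Action`, `FakePage`, `observe`, `choose`, `act`, and `run_agent` are all hypothetical names, the page is an in-memory stand-in rather than a real browser, and `choose` is a trivial placeholder for the LLM.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Action:
    id: str            # hypothetical ID, styled after the sample output (e.g. "I1")
    description: str


@dataclass
class FakePage:
    """In-memory stand-in for a browser page (assumption, not a Notte class)."""
    actions: List[Action]
    done: List[str] = field(default_factory=list)


def observe(page: FakePage) -> List[Action]:
    # Perception step: list the actions that have not been taken yet.
    return [a for a in page.actions if a.id not in page.done]


def choose(task: str, actions: List[Action]) -> Optional[Action]:
    # Placeholder for the LLM: pick the first available action, if any.
    return actions[0] if actions else None


def act(page: FakePage, action: Action) -> None:
    # Acting here only records the action as taken.
    page.done.append(action.id)


def run_agent(task: str, page: FakePage, max_steps: int = 10) -> List[str]:
    log: List[str] = []
    for step in range(max_steps):
        action = choose(task, observe(page))   # observe, then reason
        if action is None:                     # nothing left to do
            break
        act(page, action)                      # act
        log.append(f"step {step}: {action.id} ({action.description})")
    return log
```

Calling `run_agent("find flights", FakePage(actions=[...]))` returns one log line per decision, mirroring the idea that every step of the loop is recorded.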
1. Explore with Perception
You can also use Notte's perception layer outside of an agent to inspect how a page is structured.
Create a file named observe.py:
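import asyncio

import notte

async def run():
    # Start a browser session for the duration of this block.
    async with notte.Session() as page:
        # Observe the page and build its structured representation.
        obs = await page.observe("https://www.google.com/travel/flights")
        # Print the Markdown view of the available actions.
        print(obs.space.markdown)

asyncio.run(run())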
2. Run the Script
Run it:
python observe.py
3. View the Output
Sample output:
# Flight Search
* I1: Enters departure location
* I3: Selects departure date
* B3: Search flights options with current filters
This is the structured representation that your agent uses to plan and act.
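To work with this output programmatically, the Markdown listing can be turned into a lookup table of action IDs. The sketch below is an illustration only: `parse_actions` is a hypothetical helper, and the line format it expects (a bullet, an ID made of letters then digits, a colon, a description) is inferred from the sample above rather than from any documented Notte format.

```python
import re
from typing import Dict

# Assumed line shape, based only on the sample output: "* I1: description"
ACTION_LINE = re.compile(r"^\*\s+([A-Z]+\d+):\s+(.*)$")


def parse_actions(markdown: str) -> Dict[str, str]:
    """Map each action ID to its description; other lines are ignored."""
    actions: Dict[str, str] = {}
    for line in markdown.splitlines():
        match = ACTION_LINE.match(line.strip())
        if match:
            actions[match.group(1)] = match.group(2)
    return actions


sample = """# Flight Search
* I1: Enters departure location
* I3: Selects departure date
* B3: Search flights options with current filters"""

actions = parse_actions(sample)
for action_id, description in actions.items():
    print(f"{action_id} -> {description}")
```

The heading line is skipped because it does not match the bullet pattern, so only the three actions end up in the dictionary.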