How Notte Works

Understand how agents think, observe, and act, and how the perception layer helps LLMs interact with real web pages.

The Notte Architecture

Notte gives structure to the web, allowing language models to reason and act like agents. Instead of working with messy raw HTML, it builds a simplified and readable representation of every webpage so an LLM can understand what's on screen and decide what to do next.

Agent: interprets your task and manages the reasoning loop
Session: launches a real browser and keeps it alive
Perception Layer: converts the DOM into a list of possible actions
LLM: chooses the best next step (click, fill, etc.) based on context
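
To make the perception step concrete, here is a minimal sketch of turning HTML into a list of labelled actions. It is not Notte's implementation: `ToyPerception`, `perceive`, the `I`/`B` ID prefixes, and the "Fill"/"Click" descriptions are all hypothetical names chosen for illustration, and the sketch uses only Python's standard-library HTML parser.

```python
from html.parser import HTMLParser


class ToyPerception(HTMLParser):
    """Hypothetical stand-in for a perception layer (not Notte's code)."""

    def __init__(self):
        super().__init__()
        self.actions = []                  # collected (action_id, description) pairs
        self._counts = {"I": 0, "B": 0}    # per-prefix counters for IDs

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input":
            label = attrs.get("aria-label") or attrs.get("name") or "unnamed field"
            self._add("I", f"Fill '{label}'")
        elif tag == "button":
            label = attrs.get("aria-label") or "unnamed button"
            self._add("B", f"Click '{label}'")

    def _add(self, prefix, description):
        # Assign the next ID for this prefix, e.g. I1, I2, B1.
        self._counts[prefix] += 1
        self.actions.append((f"{prefix}{self._counts[prefix]}", description))


def perceive(html: str):
    """Return the interactive elements of an HTML snippet as actions."""
    parser = ToyPerception()
    parser.feed(html)
    return parser.actions


html = """
<form>
  <input name="from" aria-label="Departure">
  <input name="date" aria-label="Date">
  <button aria-label="Search">Go</button>
</form>
"""
actions = perceive(html)
for action_id, description in actions:
    print(f"* {action_id}: {description}")
```

A real perception layer has to cope with far more than two tag types, but the idea is the same: the LLM sees a short list of actions it can pick from, not the raw markup.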

Each time your agent runs, it observes the page, reasons about what to do, and acts, logging every decision along the way.
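
The observe, reason, act cycle can be sketched with plain Python. Everything here is an assumption made for illustration: `FakeSession` is an in-memory stand-in for a browser session, `choose_action` is a rule-based placeholder for the LLM, and none of these names come from the Notte API.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("toy_agent")


class FakeSession:
    """Hypothetical in-memory page; stands in for a real browser session."""

    def __init__(self):
        self.fields = {"I1": None}
        self.submitted = False

    def observe(self):
        # Return only the actions that are still available on the page.
        actions = {}
        if self.fields["I1"] is None:
            actions["I1"] = "Enter departure location"
        if not self.submitted:
            actions["B1"] = "Search flights"
        return actions

    def act(self, action_id, value=None):
        if action_id == "I1":
            self.fields["I1"] = value
        elif action_id == "B1":
            self.submitted = True


def choose_action(task, actions):
    """Placeholder for the LLM: fill any open input first, then click."""
    inputs = [a for a in actions if a.startswith("I")]
    if inputs:
        return inputs[0], task["departure"]
    return next(iter(actions)), None


def run_agent(task, session, max_steps=5):
    history = []
    for step in range(max_steps):
        actions = session.observe()                        # observe
        if not actions:
            break                                          # nothing left to do
        action_id, value = choose_action(task, actions)    # reason
        log.info("step %d: %s (%s) value=%r",
                 step, action_id, actions[action_id], value)
        session.act(action_id, value)                      # act
        history.append(action_id)
    return history


session = FakeSession()
history = run_agent({"departure": "Paris"}, session)
```

The loop stops either when no actions remain or when `max_steps` is reached, which keeps a confused agent from running indefinitely.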

Explore with Perception

You can use Notte's perception system outside of an agent to explore how the page is structured.

Create a file observe.py:

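import asyncio

import notte


async def run():
    # Start a browser session for the duration of this block.
    async with notte.Session() as page:
        # Load the page and build its structured observation.
        obs = await page.observe("https://www.google.com/travel/flights")
        # Print the available actions as Markdown.
        print(obs.space.markdown)


asyncio.run(run())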

Run the Script

Run it:

python observe.py

View the Output

The script prints the structured representation that your agent uses to plan and act. Sample output:

# Flight Search
* I1: Enters departure location
* I3: Selects departure date
* B3: Search flights options with current filters
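
If you want to work with this output programmatically, the listing can be parsed into a mapping of IDs to descriptions. The sketch below assumes every action line follows the `* ID: description` shape of the sample above; `parse_actions` and `ACTION_LINE` are hypothetical helpers written for this example, not part of Notte.

```python
import re

# Matches lines such as "* I1: Enters departure location".
ACTION_LINE = re.compile(r"^\*\s+(?P<id>[A-Z]+\d+):\s+(?P<description>.+)$")


def parse_actions(markdown: str) -> dict:
    """Map each action ID to its description; other lines are ignored."""
    actions = {}
    for line in markdown.splitlines():
        match = ACTION_LINE.match(line.strip())
        if match:
            actions[match["id"]] = match["description"]
    return actions


sample = """# Flight Search
* I1: Enters departure location
* I3: Selects departure date
* B3: Search flights options with current filters
"""
actions = parse_actions(sample)
for action_id, description in actions.items():
    print(action_id, "->", description)
```

The heading line is skipped because it does not match the pattern, so only the three actions end up in the result.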