Structured scraping
Scrape quote author profiles
Crawl quote pages, follow author links, and extract structured author biographies and quote metadata.
Run this template
Clone just this template, configure Notte, and start the run.
Before running
- Have NOTTE_API_KEY ready. Generate an API key.
Need help? Join the Notte Slack.
Extraction path
- Uses a Notte-hosted browser session controlled through Playwright.
- No AI required at runtime: selectors extract quotes, pagination, and author profiles.
- Uses inline uv script metadata, so no template-specific pyproject.toml is required.
- quotes: each record holds the quote text, author, author URL, tags, and source page.
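The quote fields listed above could be modeled roughly as follows. This is a minimal sketch, not the template's actual schema: the `QuoteRecord` class, its field names, and the `build_quote_record` helper are hypothetical, and the sample values are made up for illustration. It assumes author links on listing pages are relative hrefs that need resolving against the page they were found on.

```python
from dataclasses import dataclass, field, asdict
from urllib.parse import urljoin


@dataclass
class QuoteRecord:
    """One scraped quote. Field names are illustrative assumptions."""
    text: str
    author: str
    author_url: str
    tags: list[str] = field(default_factory=list)
    source_page: str = ""


def build_quote_record(text, author, author_href, tags, source_page):
    """Normalize raw selector output into a QuoteRecord (hypothetical helper)."""
    return QuoteRecord(
        text=text.strip(),
        author=author.strip(),
        # Resolve a relative author link against the listing page URL.
        author_url=urljoin(source_page, author_href),
        # Drop tags that are empty after trimming whitespace.
        tags=[t.strip() for t in tags if t.strip()],
        source_page=source_page,
    )


# Made-up sample values, standing in for text pulled out by selectors.
record = build_quote_record(
    "  An example quote.  ",
    "Jane Doe",
    "/author/Jane-Doe",
    ["example", " "],
    "https://quotes.toscrape.com/page/2/",
)
print(asdict(record))
```

Keeping the source page on every record makes it possible to trace a bad row back to the listing page it came from when a selector stops matching.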
Query controls
- NOTTE_API_KEY: required Notte API key.
- BASE_URL: target site, defaults to https://quotes.toscrape.com/.
- MAX_PAGES: quote listing pages to visit, defaults to 2.
- INCLUDE_AUTHOR_PROFILES: true or false, defaults to true.
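One way to read these controls from the environment is sketched below. The variable names and defaults come from the list above; the `ScrapeConfig` class, the `load_config` function, and the validation rules (rejecting a `MAX_PAGES` below 1, accepting only `true` or `false`) are assumptions for illustration and may differ from what the template does.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrapeConfig:
    """Parsed run settings. This class is a hypothetical container."""
    api_key: str
    base_url: str
    max_pages: int
    include_author_profiles: bool


def load_config(env=None):
    """Build a ScrapeConfig from a mapping, defaulting to os.environ."""
    env = os.environ if env is None else env

    # NOTTE_API_KEY is the only required setting, so fail fast without it.
    api_key = env.get("NOTTE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("NOTTE_API_KEY is required but not set")

    max_pages = int(env.get("MAX_PAGES", "2"))
    if max_pages < 1:
        raise ValueError("MAX_PAGES must be at least 1")

    flag = env.get("INCLUDE_AUTHOR_PROFILES", "true").strip().lower()
    if flag not in ("true", "false"):
        raise ValueError("INCLUDE_AUTHOR_PROFILES must be true or false")

    return ScrapeConfig(
        api_key=api_key,
        base_url=env.get("BASE_URL", "https://quotes.toscrape.com/"),
        max_pages=max_pages,
        include_author_profiles=(flag == "true"),
    )


# Passing a plain dict keeps the example independent of the real environment.
config = load_config({"NOTTE_API_KEY": "example-key", "MAX_PAGES": "3"})
print(config.base_url, config.max_pages, config.include_author_profiles)
```

Parsing everything up front means a typo in a setting surfaces before a browser session is started, not partway through a crawl.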
Scrape reliability notes
- Missing credentials: verify .env contains NOTTE_API_KEY.
- Selector changes: this template expects the Quotes to Scrape HTML structure.
- Large runs: increase MAX_PAGES and MAX_AUTHORS gradually when testing.
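For the large-run note, a cap on author visits might look like the sketch below. The `select_author_urls` helper is hypothetical, and the default value and exact semantics of `MAX_AUTHORS` are not stated here, so the limit is passed in explicitly. The sketch assumes the same author can appear on several quote pages and should be visited only once.

```python
def select_author_urls(author_urls, max_authors):
    """Return unique author URLs in first-seen order, capped at max_authors."""
    if max_authors < 0:
        raise ValueError("max_authors must not be negative")
    # dict.fromkeys drops duplicates while preserving insertion order.
    unique = list(dict.fromkeys(author_urls))
    return unique[:max_authors]


# Made-up URLs standing in for author links collected from listing pages.
seen = [
    "https://quotes.toscrape.com/author/A",
    "https://quotes.toscrape.com/author/B",
    "https://quotes.toscrape.com/author/A",
    "https://quotes.toscrape.com/author/C",
]
to_visit = select_author_urls(seen, max_authors=2)
print(to_visit)
```

Deduplicating before applying the cap keeps the number of profile page loads predictable as the limits are raised.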