Overview
Browser tools give the agent a full Chromium browser it can control programmatically. This is useful for pages that require JavaScript to render, login flows, form submission, and visual verification via screenshots. Browser tools are in the Full tier: they are only sent to frontier models (Claude, GPT-4o, Gemini) that can reliably reason about page state and UI interactions.
Browser tools require Puppeteer to be installed. Run pnpm install to install all optional dependencies, including Puppeteer.
Available Tools
| Tool | Description | Security |
|---|---|---|
| browser_navigate | Navigate to a URL | moderate |
| browser_snapshot | Get the current page accessibility tree | safe |
| browser_click | Click an element by CSS selector | moderate |
| browser_type | Type text into an input field | moderate |
| browser_search | Search the web using browser navigation | moderate |
| browser_screenshot | Take a screenshot of the current page | safe |
| browser_pages | List all open browser tabs/pages | safe |
| browser_close | Close a page or the entire browser | moderate |
Tool Details
browser_navigate
Navigate to a URL. Opens a new browser page if none is active.
Full URL to navigate to. Must be http:// or https://.
Wait condition: load, domcontentloaded, networkidle0, networkidle2.
Navigation timeout in milliseconds.
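The constraints above (http/https only, one of four wait conditions, a timeout in milliseconds) can be checked before a navigation request is sent. The following is a minimal sketch of such a check, not the tool's actual implementation. The argument names `url`, `waitUntil`, and `timeoutMs` and the function `validateNavigateArgs` are hypothetical, since the real parameter names are not shown here.

```typescript
// Hypothetical argument shape: these field names are illustrative only and
// are not taken from the real browser_navigate schema.
interface NavigateArgs {
  url: string;
  waitUntil?: string;
  timeoutMs?: number;
}

// The four wait conditions listed in the browser_navigate description.
const WAIT_CONDITIONS: readonly string[] = [
  "load",
  "domcontentloaded",
  "networkidle0",
  "networkidle2",
];

// Returns a list of problems; an empty list means the arguments look valid.
function validateNavigateArgs(args: NavigateArgs): string[] {
  const errors: string[] = [];

  // new URL() throws on strings that are not absolute URLs.
  let protocol = "";
  try {
    protocol = new URL(args.url).protocol;
  } catch {
    errors.push("not a valid absolute URL: " + args.url);
  }

  // Only http:// and https:// are accepted.
  if (protocol !== "" && protocol !== "http:" && protocol !== "https:") {
    errors.push("unsupported URL scheme: " + protocol);
  }

  if (args.waitUntil !== undefined && WAIT_CONDITIONS.indexOf(args.waitUntil) === -1) {
    errors.push("unknown wait condition: " + args.waitUntil);
  }

  // Assumption: a timeout, when given, must be a positive finite number.
  if (args.timeoutMs !== undefined && (!isFinite(args.timeoutMs) || args.timeoutMs <= 0)) {
    errors.push("timeout must be a positive number of milliseconds");
  }

  return errors;
}
```

Returning a list of problems rather than throwing on the first one lets a caller report every invalid argument at once.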
browser_snapshot
Get the current page’s accessibility tree as structured text. More token-efficient than a screenshot for text-heavy pages.
Include hidden elements in the snapshot.
browser_click
Click an element on the page.
CSS selector for the element to click. Use aria/Button Name for accessible selectors.
Mouse button: left, right, middle.
Number of clicks (use 2 for double-click).
browser_type
Type text into an input or textarea.
CSS selector for the input element.
Text to type.
Clear existing content before typing.
Delay between keystrokes in milliseconds (simulates human typing).
browser_screenshot
Capture the current page as an image: PNG, or JPEG when the format is set to jpeg.
Capture the full scrollable page, not just the viewport.
Capture only a specific element instead of the whole page.
JPEG quality (1-100). Only applies when format is jpeg.
browser_pages
List all currently open browser pages/tabs.
Returns page IDs, URLs, and titles. Use page IDs to target operations at specific tabs.
browser_close
Close a specific page or all browser pages.
ID of the specific page to close. Omit to close all pages.
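The relationship between browser_pages and browser_close (list pages with their IDs, then close one by ID or close everything) can be pictured as a small registry. This is a sketch under stated assumptions, not how the tools are implemented. The class `PageRegistry`, its method names, and the `page-N` ID format are all hypothetical.

```typescript
// What browser_pages is described as returning for each tab.
interface PageInfo {
  id: string;
  url: string;
  title: string;
}

// Hypothetical in-memory registry of open pages.
class PageRegistry {
  private pages = new Map<string, PageInfo>();
  private nextId = 1;

  // Registers a page and returns its ID (the "page-N" format is an assumption).
  open(url: string, title: string): string {
    const id = "page-" + this.nextId++;
    this.pages.set(id, { id, url, title });
    return id;
  }

  // Mirrors browser_pages: IDs, URLs, and titles of all open pages.
  list(): PageInfo[] {
    return Array.from(this.pages.values());
  }

  // Mirrors browser_close: with an ID, close that page; without one, close all.
  // Returns the number of pages that were closed.
  close(id?: string): number {
    if (id === undefined) {
      const count = this.pages.size;
      this.pages.clear();
      return count;
    }
    return this.pages.delete(id) ? 1 : 0;
  }
}
```

In this sketch, closing an unknown ID is a no-op that returns 0 rather than an error; the real tool may behave differently.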
Typical Workflow
Safe vs Full Browser Tools
By default, only “safe” browser tools (browser_snapshot, browser_screenshot, browser_pages) are included for non-frontier models. The full automation set requires a frontier model to reason about page state accurately.
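The split described in this section can be expressed as a filter over the tool table. The following is a rough sketch that follows the paragraph above; the function `toolsForModel` and the `isFrontier` flag are hypothetical, and how a model is classified as frontier is not specified here.

```typescript
type Security = "safe" | "moderate";

interface BrowserTool {
  name: string;
  security: Security;
}

// Names and security levels copied from the Available Tools table.
const BROWSER_TOOLS: BrowserTool[] = [
  { name: "browser_navigate", security: "moderate" },
  { name: "browser_snapshot", security: "safe" },
  { name: "browser_click", security: "moderate" },
  { name: "browser_type", security: "moderate" },
  { name: "browser_search", security: "moderate" },
  { name: "browser_screenshot", security: "safe" },
  { name: "browser_pages", security: "safe" },
  { name: "browser_close", security: "moderate" },
];

// Assumption: the caller already knows whether the model counts as frontier.
// Frontier models get every tool; other models get only the "safe" ones.
function toolsForModel(isFrontier: boolean): string[] {
  return BROWSER_TOOLS
    .filter((tool) => isFrontier || tool.security === "safe")
    .map((tool) => tool.name);
}
```

With this table, a non-frontier model receives browser_snapshot, browser_screenshot, and browser_pages, matching the list in the paragraph above.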