Description
The StagehandTool wraps the Stagehand Python SDK so that CrewAI agents can control a real web browser and interact with websites using three core primitives:
- Act: Perform actions like clicking, typing, or navigating
- Extract: Extract structured data from web pages
- Observe: Identify and analyze elements on the page
Requirements
Before using this tool, you will need:
- A Browserbase account with an API key and project ID
- An API key for an LLM (OpenAI or Anthropic Claude)
- The Stagehand Python SDK installed
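A quick pre-flight check can confirm that these credentials are available before a crew starts. The sketch below uses only the standard library. The environment variable names (BROWSERBASE_API_KEY, BROWSERBASE_PROJECT_ID, MODEL_API_KEY) and the missing_credentials helper are assumptions chosen for illustration, not names the tool is known to require.

```python
import os
from typing import Mapping, Sequence

# Assumed variable names for illustration; adjust them to match your own setup.
REQUIRED_VARS = ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID", "MODEL_API_KEY")


def missing_credentials(
    env: Mapping[str, str] = os.environ,
    required: Sequence[str] = REQUIRED_VARS,
) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]
```

Calling missing_credentials() with no arguments inspects the current process environment; an empty list means every assumed variable is set.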
Usage
Basic Usage
Command Types
The StagehandTool supports three command types, each designed for a specific web automation task:
1. Act - Perform Actions on a Page
The act command type (default) allows the agent to perform actions on a webpage, such as clicking buttons, filling forms, navigating, and more.
When to use: Use act when you need to interact with a webpage by performing actions like clicking, typing, scrolling, or navigating.
Example usage:
2. Extract - Get Data from a Page
The extract command type allows the agent to extract structured data from a webpage, such as product information, article text, or table data.
When to use: Use extract when you need to retrieve specific information from a webpage in a structured format.
Example usage:
3. Observe - Identify Elements on a Page
The observe command type allows the agent to identify and analyze specific elements on a webpage, returning information about their attributes, location, and suggested actions.
When to use: Use observe when you need to identify UI elements, understand page structure, or determine what actions are possible.
Example usage:
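The three command types can be summarized as the arguments passed on each tool call. The sketch below assumes the tool accepts instruction, url, and command_type arguments; the build_command helper and the example.com URLs are hypothetical. The code only builds and validates the arguments and does not contact a browser.

```python
from typing import Optional

# The three command types described above; "act" is the documented default.
COMMAND_TYPES = ("act", "extract", "observe")


def build_command(
    instruction: str,
    url: Optional[str] = None,
    command_type: str = "act",
) -> dict:
    """Build keyword arguments for one tool call (argument names are assumed)."""
    if command_type not in COMMAND_TYPES:
        raise ValueError(
            f"command_type must be one of {COMMAND_TYPES}, got {command_type!r}"
        )
    if not instruction.strip():
        raise ValueError("instruction must not be empty")
    args = {"instruction": instruction.strip(), "command_type": command_type}
    if url is not None:
        args["url"] = url
    return args


# One example per command type, using a placeholder site.
act_args = build_command("Click the 'Log in' button", url="https://example.com")
extract_args = build_command(
    "Extract every product name and price",
    url="https://example.com/products",
    command_type="extract",
)
observe_args = build_command(
    "Find all interactive elements in the navigation menu",
    command_type="observe",
)
```

Validating the command type up front turns a typo such as "scrape" into an immediate ValueError rather than a failed browser session.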
Advanced Configuration
You can customize the behavior of the StagehandTool by specifying different parameters:
Task Examples for CrewAI Agents
Here are some examples of tasks that effectively use the StagehandTool:
Tips for Effective Use
- Be specific in instructions: The more specific your instructions, the better the results. For example, instead of “click the button,” use “click the ‘Submit’ button at the bottom of the contact form.”
- Use the right command type: Choose the appropriate command type based on your task:
  - Use act for interactions and navigation
  - Use extract for gathering information
  - Use observe for understanding page structure
- Leverage selectors: When extracting data or observing elements, use CSS selectors to narrow the scope and improve accuracy.
- Handle multi-step processes: For complex workflows, break them down into multiple tool calls, each handling a specific step.
- Error handling: Implement appropriate error handling in your agent’s logic to deal with potential issues like elements not found or pages not loading.
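The last two tips can be combined in a small driver that issues one tool call per step and stops at the first failure. This is a sketch under stated assumptions: run_steps and fake_run are hypothetical, and call_tool stands in for whatever callable actually invokes the tool in your setup.

```python
from typing import Callable, Iterable


def run_steps(
    call_tool: Callable[..., str],
    steps: Iterable[dict],
) -> list[dict]:
    """Run each step as a separate tool call; stop at the first failure."""
    results = []
    for index, step in enumerate(steps, start=1):
        try:
            output = call_tool(**step)
        except Exception as exc:  # e.g. element not found, page failed to load
            results.append({"step": index, "ok": False, "error": str(exc)})
            break
        results.append({"step": index, "ok": True, "output": output})
    return results


def fake_run(instruction: str, command_type: str = "act") -> str:
    """Stand-in for a real tool call so the sketch runs without a browser."""
    if "missing" in instruction:
        raise RuntimeError("element not found")
    return f"{command_type}: done"


workflow = [
    {"instruction": "Go to the login page"},
    {"instruction": "Click the missing button"},
    {"instruction": "Extract the account balance", "command_type": "extract"},
]
report = run_steps(fake_run, workflow)
```

Here the second step fails, so the report holds two entries and the third step never runs. Stopping early keeps later steps from acting on a page that is not in the expected state.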
Troubleshooting
- Session not starting: Ensure you have valid API keys for both Browserbase and your LLM provider.
- Elements not found: Try increasing the dom_settle_timeout_ms parameter to give the page more time to load.
- Actions not working: Make sure your instructions are clear and specific. You may need to use observe first to identify the correct elements.
- Extract returning incomplete data: Try refining your instruction or providing a more specific selector.
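The "elements not found" advice can be expressed as a simple escalation loop. The sketch below assumes you can rebuild or reconfigure the tool with a different dom_settle_timeout_ms value on each attempt; retry_with_longer_settle, flaky_attempt, and the timeout values are hypothetical placeholders rather than recommended settings.

```python
from typing import Callable, Sequence


def retry_with_longer_settle(
    attempt: Callable[[int], str],
    timeouts_ms: Sequence[int] = (3000, 6000, 12000),
) -> str:
    """Call attempt(timeout_ms) with increasing timeouts until one succeeds."""
    last_error = None
    for timeout_ms in timeouts_ms:
        try:
            return attempt(timeout_ms)
        except Exception as exc:
            last_error = exc  # remember the failure and try a longer timeout
    raise RuntimeError(f"all attempts failed; last error: {last_error}")


def flaky_attempt(timeout_ms: int) -> str:
    """Stand-in for a tool call that only succeeds with a longer settle time."""
    if timeout_ms < 6000:
        raise TimeoutError("element not found")
    return f"found after waiting up to {timeout_ms} ms"


outcome = retry_with_longer_settle(flaky_attempt)
```

In a real setup, attempt would create or configure the tool with the given timeout and then run the instruction.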

