BrowserUse is a browser automation platform for creating and running browser tasks programmatically. You describe a task in natural language, and a browser agent navigates websites, fills in forms, extracts data, and performs multi-step sequences of actions without you writing browser automation code.
With BrowserUse, you can:
- Automate web interactions: Navigate to websites, click buttons, fill forms, and perform other browser actions
- Extract data: Scrape content from websites, including text, images, and structured data
- Execute complex workflows: Chain multiple actions together to complete sophisticated web tasks
- Monitor task execution: Watch browser tasks run in real-time with visual feedback
- Process results programmatically: Receive structured output from web automation tasks
In Sim, the BrowserUse integration lets your agents interact with the web as if they were human users. This supports scenarios such as research, data collection, form submission, and web testing, all driven by natural language instructions. Your agents can gather information from websites, interact with web applications, and perform actions that would otherwise require manual browsing, which makes the web available to them as a resource.
Usage Instructions
Integrate Browser Use into the workflow. The agent can navigate the web and perform actions as if a real user were interacting with the browser.
Tools
browser_use_run_task
Runs a browser automation task using BrowserUse
Input
| Parameter | Type | Required | Description |
|---|---|---|---|
| task | string | Yes | What should the browser agent do |
| startUrl | string | No | Initial page URL to start the agent on (reduces navigation steps) |
| variables | json | No | Optional secrets injected into the task (format: {key: value}) |
| allowedDomains | string | No | Comma-separated list of domains the agent is allowed to visit |
| maxSteps | number | No | Maximum number of steps the agent may take (default 100, max 10000) |
| flashMode | boolean | No | Enable flash mode (faster, less careful navigation) |
| thinking | boolean | No | Enable extended reasoning mode |
| vision | string | No | Vision capability: "true", "false", or "auto" |
| systemPromptExtension | string | No | Optional text appended to the agent system prompt (max 2000 chars) |
| structuredOutput | string | No | Stringified JSON schema for the structured output |
| highlightElements | boolean | No | Highlight interactive elements on the page (default true) |
| metadata | json | No | Custom key-value metadata (up to 10 pairs) for tracking |
| model | string | No | LLM model identifier (e.g. browser-use-2.0) |
| apiKey | string | Yes | API key for BrowserUse API |
| profile_id | string | No | Browser profile ID for persistent sessions (cookies, login state) |
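
The limits listed in the input table can be checked before a task is submitted. The following is a minimal sketch, not part of Sim or BrowserUse: the helper name `build_run_task_input` and its keyword arguments are hypothetical, and the lower bound of 1 for `maxSteps` is an assumption (the table only states the default and the maximum). It assembles a parameter dictionary using the field names from the table and serializes the JSON schema into the string form that `structuredOutput` expects.

```python
import json

# Values the table lists for the `vision` parameter.
ALLOWED_VISION = {"true", "false", "auto"}


def build_run_task_input(task, api_key, *, start_url=None, allowed_domains=None,
                         max_steps=None, vision=None,
                         system_prompt_extension=None,
                         structured_output_schema=None, metadata=None):
    """Hypothetical helper: assemble and validate browser_use_run_task input."""
    if not task:
        raise ValueError("task is required")
    if not api_key:
        raise ValueError("apiKey is required")

    params = {"task": task, "apiKey": api_key}

    if start_url is not None:
        params["startUrl"] = start_url
    if allowed_domains:
        # The tool expects one comma-separated string, not a list.
        params["allowedDomains"] = ",".join(d.strip() for d in allowed_domains)
    if max_steps is not None:
        # Upper bound comes from the table; the lower bound of 1 is assumed.
        if not 1 <= max_steps <= 10000:
            raise ValueError("maxSteps must be between 1 and 10000")
        params["maxSteps"] = max_steps
    if vision is not None:
        if vision not in ALLOWED_VISION:
            raise ValueError('vision must be "true", "false", or "auto"')
        params["vision"] = vision
    if system_prompt_extension is not None:
        if len(system_prompt_extension) > 2000:
            raise ValueError("systemPromptExtension is limited to 2000 chars")
        params["systemPromptExtension"] = system_prompt_extension
    if structured_output_schema is not None:
        # structuredOutput is a stringified JSON schema.
        params["structuredOutput"] = json.dumps(structured_output_schema)
    if metadata is not None:
        if len(metadata) > 10:
            raise ValueError("metadata is limited to 10 key-value pairs")
        params["metadata"] = metadata
    return params


if __name__ == "__main__":
    example = build_run_task_input(
        "Find the current price on the pricing page",
        "YOUR_API_KEY",  # placeholder, not a real key
        start_url="https://example.com/pricing",
        allowed_domains=["example.com"],
        max_steps=50,
        vision="auto",
        structured_output_schema={
            "type": "object",
            "properties": {"price": {"type": "string"}},
        },
    )
    print(json.dumps(example, indent=2))
```

Restricting `allowedDomains` and lowering `maxSteps` from the default of 100 are simple ways to bound what a task can do and how long it can run.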
Output
| Parameter | Type | Description |
|---|---|---|
| id | string | Task execution identifier |
| success | boolean | Task completion status |
| output | json | Final task output (string or structured) |
| steps | array | Steps the agent executed (number, memory, nextGoal, url, actions, duration) |
| ↳ number | number | Sequential step number |
| ↳ memory | string | Agent memory at this step |
| ↳ evaluationPreviousGoal | string | Evaluation of previous goal completion |
| ↳ nextGoal | string | Goal for the next step |
| ↳ url | string | Current URL of the browser |
| ↳ screenshotUrl | string | Optional screenshot URL |
| ↳ actions | array | Stringified JSON actions performed |
| ↳ duration | number | Step duration in seconds |
| liveUrl | string | Embeddable live browser session URL (active during execution) |
| shareUrl | string | Public shareable URL for the recorded session (post-run) |
| sessionId | string | Browser Use session identifier |
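
The output fields above can be condensed into a short run summary for logging or downstream blocks. The sketch below is illustrative only: `summarize_result` is a hypothetical helper, and the sample result, including the shape of each action object, is made up for the example. It relies only on what the table states, namely that `steps` is an array, that each step's `actions` entries are stringified JSON, and that `duration` is in seconds.

```python
import json


def summarize_result(result):
    """Hypothetical helper: condense a browser_use_run_task result."""
    steps = result.get("steps") or []

    # Total run time in seconds, treating a missing duration as 0.
    total_duration = sum(step.get("duration") or 0 for step in steps)

    # URLs in the order they were first visited.
    visited = []
    for step in steps:
        url = step.get("url")
        if url and url not in visited:
            visited.append(url)

    # Each action is documented as stringified JSON; keep the raw value
    # if an entry does not parse.
    actions = []
    for step in steps:
        for raw in step.get("actions") or []:
            try:
                actions.append(json.loads(raw))
            except (TypeError, ValueError):
                actions.append(raw)

    return {
        "id": result.get("id"),
        "success": bool(result.get("success")),
        "stepCount": len(steps),
        "totalDuration": total_duration,
        "visitedUrls": visited,
        "actions": actions,
    }


if __name__ == "__main__":
    # Made-up sample data; the action payloads are placeholders.
    sample = {
        "id": "task-123",
        "success": True,
        "output": "Done",
        "steps": [
            {"number": 1, "url": "https://example.com", "duration": 1.5,
             "actions": ['{"name": "open"}']},
            {"number": 2, "url": "https://example.com/pricing", "duration": 2.0,
             "actions": ['{"name": "read"}']},
        ],
    }
    print(json.dumps(summarize_result(sample), indent=2))
```

Because `output` may be either a string or a structured value, code that consumes it should check its type before reading fields from it.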