Browser Use

Run browser automation tasks

BrowserUse is a browser automation platform for creating and running browser tasks programmatically. It automates web interactions from natural language instructions, so you can navigate websites, fill forms, extract data, and perform complex sequences of actions without writing automation scripts by hand.

With BrowserUse, you can:

  • Automate web interactions: Navigate to websites, click buttons, fill forms, and perform other browser actions
  • Extract data: Scrape content from websites, including text, images, and structured data
  • Execute complex workflows: Chain multiple actions together to complete sophisticated web tasks
  • Monitor task execution: Watch browser tasks run in real-time with visual feedback
  • Process results programmatically: Receive structured output from web automation tasks

In Sim, the BrowserUse integration allows your agents to interact with the web as if they were human users. This enables scenarios such as research, data collection, form submission, and web testing, all driven by natural language instructions. Your agents can gather information from websites, interact with web applications, and perform actions that would otherwise require manual browsing.

Usage Instructions

Integrate Browser Use into the workflow. The agent can navigate the web and perform actions as if a real user were interacting with the browser.

Tools

browser_use_run_task

Runs a browser automation task using BrowserUse

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| task | string | Yes | What should the browser agent do |
| startUrl | string | No | Initial page URL to start the agent on (reduces navigation steps) |
| variables | json | No | Optional secrets injected into the task (format: {key: value}) |
| allowedDomains | string | No | Comma-separated list of domains the agent is allowed to visit |
| maxSteps | number | No | Maximum number of steps the agent may take (default 100, max 10000) |
| flashMode | boolean | No | Enable flash mode (faster, less careful navigation) |
| thinking | boolean | No | Enable extended reasoning mode |
| vision | string | No | Vision capability: "true", "false", or "auto" |
| systemPromptExtension | string | No | Optional text appended to the agent system prompt (max 2000 chars) |
| structuredOutput | string | No | Stringified JSON schema for the structured output |
| highlightElements | boolean | No | Highlight interactive elements on the page (default true) |
| metadata | json | No | Custom key-value metadata (up to 10 pairs) for tracking |
| model | string | No | LLM model identifier (e.g. browser-use-2.0) |
| apiKey | string | Yes | API key for BrowserUse API |
| profile_id | string | No | Browser profile ID for persistent sessions (cookies, login state) |
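
As a rough illustration of how these inputs fit together, the Python sketch below assembles a parameter set for browser_use_run_task and checks the limits stated in the table. The helper name build_run_task_input, the example task, the URLs, and the schema are all hypothetical; nothing here is part of the Sim or BrowserUse API, and how the values reach the tool depends on your workflow. Accepting a list for allowedDomains and a dict for structuredOutput is a convenience assumed for this sketch, since the tool itself expects a comma-separated string and a stringified JSON schema.

```python
import json

# Limits copied from the input parameter table.
MAX_STEPS_LIMIT = 10000
MAX_PROMPT_EXTENSION_CHARS = 2000
MAX_METADATA_PAIRS = 10
VISION_VALUES = {"true", "false", "auto"}


def build_run_task_input(task, api_key, **options):
    """Hypothetical helper: validate and assemble inputs for browser_use_run_task."""
    if not task:
        raise ValueError("task is required")
    if not api_key:
        raise ValueError("apiKey is required")

    inputs = dict(options)

    max_steps = inputs.get("maxSteps")
    if max_steps is not None and max_steps > MAX_STEPS_LIMIT:
        raise ValueError(f"maxSteps may not exceed {MAX_STEPS_LIMIT}")

    vision = inputs.get("vision")
    if vision is not None and vision not in VISION_VALUES:
        raise ValueError('vision must be "true", "false", or "auto"')

    extension = inputs.get("systemPromptExtension")
    if extension is not None and len(extension) > MAX_PROMPT_EXTENSION_CHARS:
        raise ValueError("systemPromptExtension is limited to 2000 characters")

    metadata = inputs.get("metadata")
    if metadata is not None and len(metadata) > MAX_METADATA_PAIRS:
        raise ValueError("metadata is limited to 10 key-value pairs")

    # allowedDomains is a comma-separated string; join a list if one was given.
    domains = inputs.get("allowedDomains")
    if isinstance(domains, (list, tuple)):
        inputs["allowedDomains"] = ",".join(domains)

    # structuredOutput is a stringified JSON schema; serialize a dict if one was given.
    schema = inputs.get("structuredOutput")
    if isinstance(schema, dict):
        inputs["structuredOutput"] = json.dumps(schema)

    return {"task": task, "apiKey": api_key, **inputs}


# Example values are placeholders for illustration only.
payload = build_run_task_input(
    task="Open the pricing page and list the plan names",
    api_key="YOUR_API_KEY",
    startUrl="https://example.com",
    allowedDomains=["example.com", "docs.example.com"],
    maxSteps=50,
    vision="auto",
    structuredOutput={
        "type": "object",
        "properties": {"plans": {"type": "array", "items": {"type": "string"}}},
    },
)
print(json.dumps(payload, indent=2))
```

Validating before the task runs is a design choice of this sketch: it surfaces an out-of-range maxSteps or an oversized systemPromptExtension locally instead of after a browser session has started.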

Output

| Parameter | Type | Description |
| --- | --- | --- |
| id | string | Task execution identifier |
| success | boolean | Task completion status |
| output | json | Final task output (string or structured) |
| steps | array | Steps the agent executed (number, memory, nextGoal, url, actions, duration) |
| ↳ number | number | Sequential step number |
| ↳ memory | string | Agent memory at this step |
| ↳ evaluationPreviousGoal | string | Evaluation of previous goal completion |
| ↳ nextGoal | string | Goal for the next step |
| ↳ url | string | Current URL of the browser |
| ↳ screenshotUrl | string | Optional screenshot URL |
| ↳ actions | array | Stringified JSON actions performed |
| ↳ duration | number | Step duration in seconds |
| liveUrl | string | Embeddable live browser session URL (active during execution) |
| shareUrl | string | Public shareable URL for the recorded session (post-run) |
| sessionId | string | Browser Use session identifier |
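
To show how the output fields might be consumed downstream, here is a minimal Python sketch that decodes the final output and summarizes the executed steps. The sample result is made up for illustration: only the field names come from the output table, while the identifiers, URLs, and the shape of each action object are placeholders. The function names parse_output and summarize_steps are hypothetical.

```python
import json

# Illustrative sample only: values and action shapes are invented placeholders.
sample_result = {
    "id": "task_123",
    "success": True,
    "output": '{"plans": ["Free", "Pro"]}',
    "steps": [
        {
            "number": 1,
            "memory": "Started on the home page",
            "evaluationPreviousGoal": "",
            "nextGoal": "Open the pricing page",
            "url": "https://example.com",
            "actions": ['{"click": {"index": 3}}'],
            "duration": 2.5,
        },
        {
            "number": 2,
            "memory": "Pricing page is open",
            "evaluationPreviousGoal": "Success",
            "nextGoal": "Read the plan names",
            "url": "https://example.com/pricing",
            "actions": ['{"extract": {"target": "plans"}}', '{"done": {}}'],
            "duration": 3.5,
        },
    ],
}


def parse_output(output):
    """Return a dict or list when output is a JSON string, else the value unchanged."""
    if isinstance(output, str):
        try:
            parsed = json.loads(output)
        except json.JSONDecodeError:
            return output
        # Plain-text answers that happen to parse as scalars stay as strings.
        return parsed if isinstance(parsed, (dict, list)) else output
    return output


def summarize_steps(steps):
    """Return the total duration in seconds and one summary line per step."""
    total = sum(step.get("duration", 0) for step in steps)
    lines = []
    for step in steps:
        # Each entry in actions is a stringified JSON object.
        actions = [json.loads(action) for action in step.get("actions", [])]
        lines.append(
            f"{step['number']}. {step['nextGoal']} "
            f"({len(actions)} action(s), {step['duration']}s) at {step['url']}"
        )
    return total, lines


if sample_result["success"]:
    result_data = parse_output(sample_result["output"])
    total_seconds, step_lines = summarize_steps(sample_result["steps"])
    print(result_data)
    print("\n".join(step_lines))
    print(f"Total duration: {total_seconds}s")
```

Because output may be either a string or structured data, the sketch normalizes it once up front so later code can branch on the Python type rather than re-parsing.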
