Bright Data

Scrape websites, search engines, and extract structured data

Usage Instructions

Integrate Bright Data into the workflow. Scrape any URL with Web Unlocker, search Google and other engines with SERP API, discover web content ranked by intent, or trigger pre-built scrapers for structured data extraction.

Tools

brightdata_scrape_url

Fetch content from any URL using Bright Data Web Unlocker. Bypasses anti-bot protections, CAPTCHAs, and IP blocks automatically.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Bright Data API token |
| zone | string | Yes | Web Unlocker zone name from your Bright Data dashboard (e.g., "web_unlocker1") |
| url | string | Yes | The URL to scrape (e.g., "https://example.com/page") |
| format | string | No | Response format: "raw" for HTML or "json" for parsed content. Defaults to "raw" |
| country | string | No | Two-letter country code for geo-targeting (e.g., "us", "gb") |
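The input above can be assembled and checked before it is handed to the tool. The following is a minimal sketch, not part of Bright Data's API: `build_scrape_url_input` is a hypothetical helper, and the lowercase two-letter check on `country` is an assumption based only on the examples "us" and "gb". How the tool itself is invoked depends on the workflow and is not shown.

```python
import re

# Formats listed in the input table for brightdata_scrape_url.
ALLOWED_FORMATS = {"raw", "json"}


def build_scrape_url_input(api_key, zone, url, fmt="raw", country=None):
    """Assemble and validate an input dict (hypothetical helper)."""
    if not (api_key and zone and url):
        raise ValueError("apiKey, zone, and url are required")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    payload = {"apiKey": api_key, "zone": zone, "url": url, "format": fmt}
    if country is not None:
        # Assumption: lowercase two-letter codes, as in the documented examples.
        if not re.fullmatch(r"[a-z]{2}", country):
            raise ValueError("country must be a two-letter code such as 'us'")
        payload["country"] = country
    return payload
```

Optional fields are left out of the dict when unset, so the tool's own defaults apply.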

Output

| Parameter | Type | Description |
| --- | --- | --- |
| content | string | The scraped page content (HTML or JSON depending on format) |
| url | string | The URL that was scraped |
| statusCode | number | HTTP status code of the response |

Search Google, Bing, DuckDuckGo, or Yandex and get structured search results using Bright Data SERP API.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Bright Data API token |
| zone | string | Yes | SERP API zone name from your Bright Data dashboard (e.g., "serp_api1") |
| query | string | Yes | The search query (e.g., "best project management tools") |
| searchEngine | string | No | Search engine to use: "google", "bing", "duckduckgo", or "yandex". Defaults to "google" |
| country | string | No | Two-letter country code for localized results (e.g., "us", "gb") |
| language | string | No | Two-letter language code (e.g., "en", "es") |
| numResults | number | No | Number of results to return (e.g., 10, 20). Defaults to 10 |

Output

| Parameter | Type | Description |
| --- | --- | --- |
| results | array | Array of search results |
| title | string | Title of the search result |
| url | string | URL of the search result |
| description | string | Snippet or description of the result |
| rank | number | Position in search results |
| query | string | The search query that was executed |
| searchEngine | string | The search engine that was used |
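A short sketch of consuming this output: picking the best-ranked entries. It assumes that `title`, `url`, `description`, and `rank` are fields of each item in `results`, and that a lower `rank` means a higher position; neither is stated explicitly above. `top_results` is a hypothetical helper name.

```python
def top_results(output, limit=3):
    """Return up to `limit` results, lowest rank number first.

    Assumes each item in output["results"] is a dict with a numeric "rank".
    """
    ranked = sorted(output.get("results", []), key=lambda r: r["rank"])
    return ranked[:limit]
```

Sorting locally avoids relying on the order in which the results arrive.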

brightdata_discover

AI-powered web discovery that finds and ranks results by intent. Returns up to 1,000 results with optional cleaned page content for RAG and verification.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Bright Data API token |
| query | string | Yes | The search query (e.g., "competitor pricing changes enterprise plan") |
| numResults | number | No | Number of results to return, up to 1000. Defaults to 10 |
| intent | string | No | Describes what the agent is trying to accomplish, used to rank results by relevance (e.g., "find official pricing pages and change notes") |
| includeContent | boolean | No | Whether to include cleaned page content in results |
| format | string | No | Response format: "json" or "markdown". Defaults to "json" |
| language | string | No | Search language code (e.g., "en", "es", "fr"). Defaults to "en" |
| country | string | No | Two-letter ISO country code for localized results (e.g., "us", "gb") |

Output

| Parameter | Type | Description |
| --- | --- | --- |
| results | array | Array of discovered web results ranked by intent relevance |
| url | string | URL of the discovered page |
| title | string | Page title |
| description | string | Page description or snippet |
| relevanceScore | number | AI-calculated relevance score for intent-based ranking |
| content | string | Cleaned page content in the requested format (when includeContent is true) |
| query | string | The search query that was executed |
| totalResults | number | Total number of results returned |
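When the results feed a RAG step, it can help to keep only entries that carry content and clear a relevance bar. The sketch below assumes that a higher `relevanceScore` means a more relevant result and that the per-result fields are nested inside each `results` item. The score scale is not documented here, so the threshold is left to the caller. `select_for_rag` is a hypothetical helper.

```python
def select_for_rag(output, min_score):
    """Keep results that have content and score at least `min_score`.

    Assumption: higher relevanceScore means more relevant.
    Returns the kept results, highest score first.
    """
    selected = [
        r
        for r in output.get("results", [])
        if r.get("content") and r.get("relevanceScore", 0) >= min_score
    ]
    return sorted(selected, key=lambda r: r["relevanceScore"], reverse=True)
```

Results without `content` are dropped, which is the expected case when `includeContent` was not set.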

brightdata_sync_scrape

Scrape URLs synchronously using a Bright Data pre-built scraper and get structured results directly. Supports up to 20 URLs with a 1-minute timeout.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Bright Data API token |
| datasetId | string | Yes | Dataset scraper ID from your Bright Data dashboard (e.g., "gd_l1viktl72bvl7bjuj0") |
| urls | string | Yes | JSON array of URL objects to scrape, up to 20 (e.g., [{"url": "https://example.com/product"}]) |
| format | string | No | Output format: "json", "ndjson", or "csv". Defaults to "json" |
| includeErrors | boolean | No | Whether to include error reports in results |
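Note that `urls` is typed as a string holding a JSON array, not as a native list. A minimal sketch of producing that string with the 20-URL limit enforced follows; `encode_urls` and `MAX_SYNC_URLS` are hypothetical names introduced for illustration.

```python
import json

# Limit stated for brightdata_sync_scrape.
MAX_SYNC_URLS = 20


def encode_urls(urls, limit=MAX_SYNC_URLS):
    """Serialize a list of URL strings into the JSON array string for `urls`."""
    if not urls:
        raise ValueError("at least one URL is required")
    if len(urls) > limit:
        raise ValueError(f"at most {limit} URLs are supported, got {len(urls)}")
    return json.dumps([{"url": u} for u in urls])
```

For example, `encode_urls(["https://example.com/product"])` yields the string `[{"url": "https://example.com/product"}]`, matching the example in the table.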

Output

| Parameter | Type | Description |
| --- | --- | --- |
| data | array | Array of scraped result objects with fields specific to the dataset scraper used |
| snapshotId | string | Snapshot ID returned if the request exceeded the 1-minute timeout and switched to async processing |
| isAsync | boolean | Whether the request fell back to async mode (true means use snapshot ID to retrieve results) |
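Because a sync request can fall back to async mode, callers need to branch on `isAsync`. The following sketch shows one way to do that; `resolve_sync_result` is a hypothetical helper, and the tagged-tuple return shape is an illustrative design choice rather than anything the tool prescribes.

```python
def resolve_sync_result(output):
    """Classify a brightdata_sync_scrape output.

    Returns ("data", records) when results arrived directly, or
    ("snapshot", snapshot_id) when the request fell back to async mode.
    """
    if output.get("isAsync"):
        snapshot_id = output.get("snapshotId")
        if not snapshot_id:
            raise ValueError("isAsync is true but no snapshotId was returned")
        return ("snapshot", snapshot_id)
    return ("data", output.get("data", []))
```

In the "snapshot" case the ID can then be passed to brightdata_snapshot_status and brightdata_download_snapshot.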

brightdata_scrape_dataset

Trigger a Bright Data pre-built scraper to extract structured data from URLs. Supports 660+ scrapers for platforms like Amazon, LinkedIn, Instagram, and more.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Bright Data API token |
| datasetId | string | Yes | Dataset scraper ID from your Bright Data dashboard (e.g., "gd_l1viktl72bvl7bjuj0") |
| urls | string | Yes | JSON array of URL objects to scrape (e.g., [{"url": "https://example.com/product"}]) |
| format | string | No | Output format: "json" or "csv". Defaults to "json" |

Output

| Parameter | Type | Description |
| --- | --- | --- |
| snapshotId | string | The snapshot ID to retrieve results later |
| status | string | Status of the scraping job (e.g., "triggered", "running") |

brightdata_snapshot_status

Check the progress of an async Bright Data scraping job. Returns status: starting, running, ready, or failed.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Bright Data API token |
| snapshotId | string | Yes | The snapshot ID returned when the collection was triggered (e.g., "s_m4x7enmven8djfqak") |

Output

| Parameter | Type | Description |
| --- | --- | --- |
| snapshotId | string | The snapshot ID that was queried |
| datasetId | string | The dataset ID associated with this snapshot |
| status | string | Current status of the snapshot: "starting", "running", "ready", or "failed" |
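An async job is typically polled until it reaches a terminal status. The sketch below treats "ready" and "failed" as terminal, based on the status list above. `check_status` is a hypothetical callable supplied by the caller that runs brightdata_snapshot_status and returns its output as a dict; the polling interval and attempt limit are arbitrary illustrative defaults.

```python
import time

# Statuses after which polling can stop.
TERMINAL_STATUSES = {"ready", "failed"}


def wait_for_snapshot(check_status, snapshot_id, interval=5.0, max_polls=60,
                      sleep=time.sleep):
    """Poll until the snapshot is "ready" or "failed" and return that status.

    Raises TimeoutError if no terminal status is seen within `max_polls` checks.
    """
    for _ in range(max_polls):
        status = check_status(snapshot_id)["status"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval)
    raise TimeoutError(f"snapshot {snapshot_id} not finished after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.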

brightdata_download_snapshot

Download the results of a completed Bright Data scraping job using its snapshot ID. The snapshot must have ready status.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Bright Data API token |
| snapshotId | string | Yes | The snapshot ID returned when the collection was triggered (e.g., "s_m4x7enmven8djfqak") |
| format | string | No | Output format: "json", "ndjson", "jsonl", or "csv". Defaults to "json" |
| compress | boolean | No | Whether to compress the results |

Output

| Parameter | Type | Description |
| --- | --- | --- |
| data | array | Array of scraped result records |
| format | string | The content type of the downloaded data |
| snapshotId | string | The snapshot ID that was downloaded |
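Since a snapshot must be in "ready" status before it can be downloaded, a guard like the following can prevent premature download attempts. This is a sketch under stated assumptions: `check_status` and `download` are hypothetical callables supplied by the caller that run brightdata_snapshot_status and brightdata_download_snapshot and return their outputs as dicts.

```python
def download_if_ready(check_status, download, snapshot_id, fmt="json"):
    """Download snapshot records only when the snapshot status is "ready".

    Returns the `data` array, or None if the snapshot is not ready yet.
    """
    status = check_status(snapshot_id)["status"]
    if status != "ready":
        return None
    return download(snapshot_id, fmt)["data"]
```

Returning None rather than raising leaves the retry policy to the caller.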

brightdata_cancel_snapshot

Cancel an active Bright Data scraping job using its snapshot ID. Terminates data collection in progress.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Bright Data API token |
| snapshotId | string | Yes | The snapshot ID of the collection to cancel (e.g., "s_m4x7enmven8djfqak") |

Output

| Parameter | Type | Description |
| --- | --- | --- |
| snapshotId | string | The snapshot ID that was cancelled |
| cancelled | boolean | Whether the cancellation was successful |
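Cancellation applies to active jobs, so it can be worth checking the status first. The sketch below assumes that "active" means a status of "starting" or "running", which is an interpretation of the status list for brightdata_snapshot_status rather than a documented rule. `check_status` and `cancel` are hypothetical callables supplied by the caller that run the corresponding tools and return their outputs as dicts.

```python
# Assumption: these are the statuses of a job that is still in progress.
ACTIVE_STATUSES = {"starting", "running"}


def cancel_if_active(check_status, cancel, snapshot_id):
    """Cancel the snapshot only if it is still in progress.

    Returns True when a cancellation was requested and reported successful,
    and False when the job was not active or the cancellation failed.
    """
    status = check_status(snapshot_id)["status"]
    if status not in ACTIVE_STATUSES:
        return False
    return bool(cancel(snapshot_id)["cancelled"])
```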
