Sim

Jina

Search the web or extract content from URLs

Jina AI is a content extraction tool that integrates with Sim to turn web content into clean, readable text. The integration lets developers add web content processing to their agentic workflows.

Jina AI Reader specializes in extracting the most relevant content from web pages, removing clutter, advertisements, and formatting issues to produce clean, structured text that's optimized for language models and other text processing tasks.

With the Jina AI integration in Sim, you can:

  • Extract clean content from any web page by simply providing a URL
  • Process complex web layouts into structured, readable text
  • Maintain important context while removing unnecessary elements
  • Prepare web content for further processing in your agent workflows
  • Streamline research tasks by quickly converting web information into usable data

This integration is particularly valuable for building agents that need to gather and process information from the web, conduct research, or analyze online content as part of their workflow.

Usage Instructions

Integrate Jina AI into the workflow. Search the web and get LLM-friendly results, or extract clean content from specific URLs with advanced parsing options.

Tools

jina_read_url

Extract and process web content into clean, LLM-friendly text using Jina AI Reader. Supports advanced content parsing, link gathering, and multiple output formats with configurable processing options.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | The URL to read and convert to markdown |
| useReaderLMv2 | boolean | No | Whether to use ReaderLM-v2 for better quality (3x token cost) |
| gatherLinks | boolean | No | Whether to gather all links at the end |
| jsonResponse | boolean | No | Whether to return response in JSON format |
| apiKey | string | Yes | Your Jina AI API key |
| withImagesummary | boolean | No | Gather all images from the page with metadata |
| retainImages | string | No | Control image inclusion: "none" removes all, "all" keeps all |
| returnFormat | string | No | Output format: markdown, html, text, screenshot, or pageshot |
| withIframe | boolean | No | Include iframe content in extraction |
| withShadowDom | boolean | No | Extract Shadow DOM content |
| noCache | boolean | No | Bypass cached content for real-time retrieval |
| withGeneratedAlt | boolean | No | Generate alt text for images using VLM |
| robotsTxt | string | No | Bot User-Agent for robots.txt checking |
| dnt | boolean | No | Do Not Track - prevents caching/tracking |
| noGfm | boolean | No | Disable GitHub Flavored Markdown |
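
As an illustration of how the inputs above fit together, the following is a minimal sketch that assembles and sanity-checks a jina_read_url input payload before it is handed to a workflow. The helper name `build_read_url_input` is hypothetical and not part of Sim or Jina AI; only the parameter names and the list of returnFormat values come from the parameters listed above.

```python
from typing import Any

# Allowed returnFormat values, copied from the parameter list above.
RETURN_FORMATS = {"markdown", "html", "text", "screenshot", "pageshot"}


def build_read_url_input(url: str, api_key: str, **options: Any) -> dict[str, Any]:
    """Hypothetical helper: assemble and sanity-check jina_read_url inputs."""
    # url and apiKey are the only required parameters.
    if not url:
        raise ValueError("url is required")
    if not api_key:
        raise ValueError("apiKey is required")

    fmt = options.get("returnFormat")
    if fmt is not None and fmt not in RETURN_FORMATS:
        raise ValueError(f"unsupported returnFormat: {fmt!r}")

    # Drop options left as None so only explicit choices are included.
    payload: dict[str, Any] = {"url": url, "apiKey": api_key}
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


if __name__ == "__main__":
    print(build_read_url_input(
        "https://example.com/article",
        "YOUR_JINA_API_KEY",  # placeholder, not a real key
        returnFormat="markdown",
        gatherLinks=True,
        noCache=True,
    ))
```

Validating early like this is a design choice rather than a requirement: it surfaces a missing key or a mistyped format before any request is made.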

Output

| Parameter | Type | Description |
| --- | --- | --- |
| content | string | The extracted content from the URL, processed into clean, LLM-friendly text |
| links | array | List of links found on the page (when gatherLinks or withLinksummary is enabled) |
| images | array | List of images found on the page (when withImagesummary is enabled) |
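
A downstream step might condense this output before logging it or passing it on. The sketch below assumes the output arrives as a dictionary with the three fields above, and that links and images may be missing when their input flags are off; the helper `summarize_read_output` is hypothetical, and the shape of individual link and image entries is not specified here, so the code only counts them.

```python
from typing import Any


def summarize_read_output(output: dict[str, Any], max_chars: int = 500) -> str:
    """Hypothetical helper: condense a jina_read_url result for logging."""
    content = output.get("content", "")
    # Assumption: links and images are absent or empty unless the
    # corresponding input flags were enabled.
    links = output.get("links") or []
    images = output.get("images") or []

    preview = content[:max_chars]
    if len(content) > max_chars:
        preview += "..."
    return f"{preview}\n[{len(links)} links, {len(images)} images]"
```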

A separate search tool queries the web and returns the top results (5 by default) with LLM-friendly content. Each result is automatically processed through the Jina Reader API. The tool supports geographic filtering, site restrictions, and pagination.

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| q | string | Yes | Search query string |
| apiKey | string | Yes | Your Jina AI API key |
| num | number | No | Maximum number of results per page (default: 5) |
| site | string | No | Restrict results to specific domain(s). Can be comma-separated for multiple sites (e.g., "jina.ai,github.com") |
| withFavicon | boolean | No | Include website favicons in results |
| withImagesummary | boolean | No | Gather all images from result pages with metadata |
| withLinksummary | boolean | No | Gather all links from result pages |
| retainImages | string | No | Control image inclusion: "none" removes all, "all" keeps all |
| noCache | boolean | No | Bypass cached content for real-time retrieval |
| withGeneratedAlt | boolean | No | Generate alt text for images using VLM |
| respondWith | string | No | Set to "no-content" to get only metadata without page content |
| returnFormat | string | No | Output format: markdown, html, text, screenshot, or pageshot |
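
The site parameter expects one comma-separated string, which is easy to get wrong when the domains start out as a list. A minimal sketch of assembling the search inputs, assuming a hypothetical helper `build_search_input` (not part of Sim or Jina AI) and using only the parameter names listed above:

```python
from typing import Any, Iterable, Optional


def build_search_input(
    q: str,
    api_key: str,
    sites: Optional[Iterable[str]] = None,
    num: int = 5,
    **options: Any,
) -> dict[str, Any]:
    """Hypothetical helper: assemble inputs for the search tool."""
    # q and apiKey are the only required parameters.
    if not q.strip():
        raise ValueError("q is required")
    if not api_key:
        raise ValueError("apiKey is required")

    payload: dict[str, Any] = {"q": q, "apiKey": api_key, "num": num}

    # The site parameter takes a single comma-separated string of domains.
    domains = [s.strip() for s in (sites or []) if s.strip()]
    if domains:
        payload["site"] = ",".join(domains)

    # Drop options left as None so only explicit choices are included.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```

For example, `build_search_input("reader api", "YOUR_JINA_API_KEY", sites=["jina.ai", "github.com"])` produces a payload whose site value is "jina.ai,github.com".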

Output

| Parameter | Type | Description |
| --- | --- | --- |
| results | array | Array of search results, each containing title, description, url, and LLM-friendly content |
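
Search results are often folded into a single context block for a later agent step. The sketch below assumes each result is a dictionary with the title, description, url, and content fields named above, and that content may be empty or missing when respondWith is set to "no-content"; in that case it falls back to the description. The helper `results_to_context` is hypothetical.

```python
from typing import Any


def results_to_context(results: list[dict[str, Any]]) -> str:
    """Hypothetical helper: turn search results into a numbered text block."""
    blocks = []
    for i, result in enumerate(results, start=1):
        # Assumption: content can be absent (e.g. respondWith="no-content"),
        # so fall back to the shorter description.
        body = result.get("content") or result.get("description", "")
        blocks.append(f"[{i}] {result.get('title', '')}\n{result.get('url', '')}\n{body}")
    return "\n\n".join(blocks)
```

Numbering the entries lets a language model cite a source as [1] or [2] in its answer, which is one common reason to format results this way.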

Notes

  • Category: tools
  • Type: jina