Sim

AWS Textract

Extract text, tables, and forms from documents

AWS Textract is a powerful AI service from Amazon Web Services designed to automatically extract printed text, handwriting, tables, forms, key-value pairs, and other structured data from scanned documents and images. Textract leverages advanced optical character recognition (OCR) and document analysis to transform documents into actionable data, enabling automation, analytics, compliance, and more.

With AWS Textract, you can:

  • Extract text from images and documents: Recognize printed text and handwriting in formats such as PDF, JPEG, PNG, or TIFF
  • Detect and extract tables: Automatically find tables and output their structured content
  • Parse forms and key-value pairs: Pull structured data from forms, including fields and their corresponding values
  • Identify signatures and layout features: Detect signatures, geometric layout, and relationships between document elements
  • Customize extraction with queries: Extract specific fields and answers using query-based extraction (e.g., "What is the invoice number?")

In Sim, the AWS Textract integration empowers your agents to intelligently process documents as part of their workflows. This unlocks automation scenarios such as data entry from invoices, onboarding documents, contracts, receipts, and more. Your agents can extract relevant data, analyze structured forms, and generate summaries or reports directly from document uploads or URLs. By connecting Sim with AWS Textract, you can reduce manual effort, improve data accuracy, and streamline your business processes with robust document understanding.

Usage Instructions

Integrate AWS Textract into your workflow to extract text, tables, forms, and key-value pairs from documents. Single-page mode supports JPEG, PNG, and single-page PDF. Multi-page mode supports multi-page PDF and TIFF.
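The mode rules above can be sketched as a small helper that picks a `processingMode` value from a file name. This is a minimal sketch, not part of Sim or AWS: `pick_processing_mode` and its `page_count` argument are hypothetical, and sending every TIFF to multi-page mode is one reading of the sentence above.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the supported formats listed above.
SINGLE_PAGE_ONLY = {".jpg", ".jpeg", ".png"}
MULTI_PAGE_ONLY = {".tif", ".tiff"}  # assumption: TIFF always goes to multi-page mode


def pick_processing_mode(filename: str, page_count: Optional[int] = None) -> str:
    """Return "single-page" or "multi-page" for a document (hypothetical helper)."""
    ext = Path(filename).suffix.lower()
    if ext in SINGLE_PAGE_ONLY:
        return "single-page"
    if ext in MULTI_PAGE_ONLY:
        return "multi-page"
    if ext == ".pdf":
        # A PDF fits either mode, so the caller has to supply the page count.
        if page_count is None or page_count < 1:
            raise ValueError("a positive page_count is required for PDF files")
        return "single-page" if page_count == 1 else "multi-page"
    raise ValueError(f"unsupported file type: {filename!r}")
```

For example, `pick_processing_mode("report.pdf", page_count=12)` selects multi-page mode, while a PNG scan stays in single-page mode.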

Tools

textract_parser

Parse documents using AWS Textract OCR and document analysis

Input

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| accessKeyId | string | Yes | AWS Access Key ID |
| secretAccessKey | string | Yes | AWS Secret Access Key |
| region | string | Yes | AWS region for the Textract service (e.g., us-east-1) |
| processingMode | string | No | Document type: single-page or multi-page. Defaults to single-page. |
| filePath | string | No | URL of the document to process (JPEG, PNG, or single-page PDF) |
| s3Uri | string | No | S3 URI for multi-page processing (s3://bucket/key) |
| fileUpload | object | No | File upload data from the file-upload component |
| featureTypes | array | No | Feature types to detect: TABLES, FORMS, QUERIES, SIGNATURES, LAYOUT. If not specified, only text detection is performed. |
| featureTypes[] | string | No | Feature type |
| queries | array | No | Custom queries to extract specific information. Only used when featureTypes includes QUERIES. |
| queries[] | object | No | Query configuration |
| queries[].Text | string | No | The query text |
| queries[].Alias | string | No | Query alias |
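As an illustration of how these inputs fit together, the sketch below assembles an input payload keyed by the parameter names in the table and checks the constraints the table documents. It is a minimal sketch under assumptions: `build_textract_input` is a hypothetical helper rather than a Sim API, it covers only `filePath` and `s3Uri` as document sources (not `fileUpload`), and rejecting queries without the QUERIES feature is a design choice, since the tool itself is only documented as ignoring them.

```python
from typing import Any, Dict, List, Optional

# Feature types listed in the input table.
ALLOWED_FEATURES = {"TABLES", "FORMS", "QUERIES", "SIGNATURES", "LAYOUT"}


def build_textract_input(
    access_key_id: str,
    secret_access_key: str,
    region: str,
    *,
    file_path: Optional[str] = None,
    s3_uri: Optional[str] = None,
    processing_mode: str = "single-page",
    feature_types: Optional[List[str]] = None,
    queries: Optional[List[Dict[str, str]]] = None,
) -> Dict[str, Any]:
    """Assemble a textract_parser input dict (hypothetical helper)."""
    if processing_mode not in ("single-page", "multi-page"):
        raise ValueError(f"unknown processingMode: {processing_mode!r}")
    if s3_uri is not None and not s3_uri.startswith("s3://"):
        raise ValueError("s3Uri must look like s3://bucket/key")

    features = list(feature_types or [])
    unknown = set(features) - ALLOWED_FEATURES
    if unknown:
        raise ValueError(f"unknown feature types: {sorted(unknown)}")
    # Documented behavior: queries are only used when QUERIES is requested.
    # Failing early here is this sketch's choice, to surface a likely mistake.
    if queries and "QUERIES" not in features:
        raise ValueError("queries require 'QUERIES' in featureTypes")

    payload: Dict[str, Any] = {
        "accessKeyId": access_key_id,
        "secretAccessKey": secret_access_key,
        "region": region,
        "processingMode": processing_mode,
    }
    # Optional parameters are included only when they were provided.
    if file_path:
        payload["filePath"] = file_path
    if s3_uri:
        payload["s3Uri"] = s3_uri
    if features:
        payload["featureTypes"] = features
    if queries:
        payload["queries"] = queries
    return payload
```

A call with `feature_types=["FORMS", "QUERIES"]` and `queries=[{"Text": "What is the invoice number?", "Alias": "invoice_number"}]` yields a payload that requests form parsing and one query; leaving `feature_types` empty yields plain text detection.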

Output

| Parameter | Type | Description |
| --- | --- | --- |
| blocks | array | Array of Block objects containing detected text, tables, forms, and other elements |
| blocks[].BlockType | string | Type of block (PAGE, LINE, WORD, TABLE, CELL, KEY_VALUE_SET, etc.) |
| blocks[].Id | string | Unique identifier for the block |
| blocks[].Text | string | Text recognized in the block |
| blocks[].TextType | string | Type of text (PRINTED or HANDWRITING) |
| blocks[].Confidence | number | Confidence score (0-100) |
| blocks[].Page | number | Page number |
| blocks[].Geometry | object | Location and bounding box information |
| blocks[].Geometry.BoundingBox | object | Bounding box of the block |
| blocks[].Geometry.BoundingBox.Height | number | Height as ratio of document height |
| blocks[].Geometry.BoundingBox.Left | number | Left position as ratio of document width |
| blocks[].Geometry.BoundingBox.Top | number | Top position as ratio of document height |
| blocks[].Geometry.BoundingBox.Width | number | Width as ratio of document width |
| blocks[].Geometry.Polygon | array | Polygon coordinates |
| blocks[].Geometry.Polygon[].X | number | X coordinate |
| blocks[].Geometry.Polygon[].Y | number | Y coordinate |
| blocks[].Relationships | array | Relationships to other blocks |
| blocks[].Relationships[].Type | string | Relationship type (CHILD, VALUE, ANSWER, etc.) |
| blocks[].Relationships[].Ids | array | IDs of related blocks |
| blocks[].EntityTypes | array | Entity types for KEY_VALUE_SET (KEY or VALUE) |
| blocks[].SelectionStatus | string | For checkboxes: SELECTED or NOT_SELECTED |
| blocks[].RowIndex | number | Row index for table cells |
| blocks[].ColumnIndex | number | Column index for table cells |
| blocks[].RowSpan | number | Row span for merged cells |
| blocks[].ColumnSpan | number | Column span for merged cells |
| blocks[].Query | object | Query information for QUERY blocks |
| blocks[].Query.Text | string | Query text |
| blocks[].Query.Alias | string | Query alias |
| blocks[].Query.Pages | array | Pages to search |
| documentMetadata | object | Metadata about the analyzed document |
| documentMetadata.pages | number | Number of pages in the document |
| modelVersion | string | Version of the Textract model used for processing |
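Because the output is a flat list of blocks linked by `Relationships`, a consumer usually indexes the blocks by `Id` and follows CHILD and VALUE links. The sketch below shows one way to pull lines of text and form key-value pairs out of `blocks`. It is a minimal sketch: the helper names are hypothetical, and it assumes the blocks follow the field names documented above, plus the Textract `SELECTION_ELEMENT` block type for checkboxes.

```python
from typing import Any, Dict, List

Block = Dict[str, Any]


def related_ids(block: Block, rel_type: str) -> List[str]:
    """Collect the Ids from every relationship of the given Type."""
    ids: List[str] = []
    for rel in block.get("Relationships") or []:
        if rel.get("Type") == rel_type:
            ids.extend(rel.get("Ids") or [])
    return ids


def text_of(block: Block, by_id: Dict[str, Block]) -> str:
    """Join the text of a block's CHILD words; checkboxes become [x] or [ ]."""
    parts: List[str] = []
    for child_id in related_ids(block, "CHILD"):
        child = by_id.get(child_id)
        if child is None:
            continue
        if child.get("BlockType") == "WORD":
            parts.append(child.get("Text", ""))
        elif child.get("BlockType") == "SELECTION_ELEMENT":
            selected = child.get("SelectionStatus") == "SELECTED"
            parts.append("[x]" if selected else "[ ]")
    return " ".join(parts)


def lines_of_text(blocks: List[Block]) -> List[str]:
    """Return the text of every LINE block, in the order received."""
    return [b.get("Text", "") for b in blocks if b.get("BlockType") == "LINE"]


def key_value_pairs(blocks: List[Block]) -> Dict[str, str]:
    """Map each form key to its value by following KEY -> VALUE relationships."""
    by_id = {b["Id"]: b for b in blocks}
    pairs: Dict[str, str] = {}
    for block in blocks:
        is_key = (
            block.get("BlockType") == "KEY_VALUE_SET"
            and "KEY" in (block.get("EntityTypes") or [])
        )
        if not is_key:
            continue
        key_text = text_of(block, by_id)
        value_text = " ".join(
            text_of(by_id[value_id], by_id)
            for value_id in related_ids(block, "VALUE")
            if value_id in by_id
        )
        # If the same key text appears twice, the later occurrence wins.
        pairs[key_text] = value_text
    return pairs
```

Key-value pairs are only present when FORMS was requested in `featureTypes`; with plain text detection, `lines_of_text` is the useful entry point.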