AWS Textract
Extract text, tables, and forms from documents
AWS Textract is an Amazon Web Services machine learning service that extracts printed text, handwriting, tables, forms, key-value pairs, and other structured data from scanned documents and images. It combines optical character recognition (OCR) with document analysis so that document contents can feed automation, analytics, and compliance workflows.
With AWS Textract, you can:
- Extract text from images and documents: Recognize printed text and handwriting in formats such as PDF, JPEG, PNG, or TIFF
- Detect and extract tables: Automatically find tables and output their structured content
- Parse forms and key-value pairs: Pull structured data from forms, including fields and their corresponding values
- Identify signatures and layout features: Detect signatures, geometric layout, and relationships between document elements
- Customize extraction with queries: Extract specific fields and answers using query-based extraction (e.g., "What is the invoice number?")
In Sim, the AWS Textract integration lets your agents process documents as part of their workflows. Typical scenarios include data entry from invoices, onboarding documents, contracts, and receipts. Agents can extract relevant data, analyze structured forms, and generate summaries or reports directly from document uploads or URLs, which reduces manual effort and improves data accuracy.
Usage Instructions
Integrate AWS Textract into your workflow to extract text, tables, forms, and key-value pairs from documents. Single-page mode supports JPEG, PNG, and single-page PDF. Multi-page mode supports multi-page PDF and TIFF.
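The format rules above can be sketched as a small helper that picks a processing mode from a file name and page count. This is only an illustration: `pick_processing_mode` and the extension sets are assumptions made for the example, not part of the Sim integration.

```python
from pathlib import PurePosixPath

# Formats named in the usage notes. TIFF is listed under multi-page mode only.
SINGLE_PAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
MULTI_PAGE_ONLY_EXTENSIONS = {".tif", ".tiff"}


def pick_processing_mode(filename: str, page_count: int = 1) -> str:
    """Return "single-page" or "multi-page" for a document (hypothetical helper)."""
    ext = PurePosixPath(filename).suffix.lower()
    if ext in SINGLE_PAGE_EXTENSIONS:
        return "single-page"
    if ext in MULTI_PAGE_ONLY_EXTENSIONS:
        return "multi-page"
    if ext == ".pdf":
        # A PDF can go either way depending on how many pages it has.
        return "single-page" if page_count <= 1 else "multi-page"
    raise ValueError(f"Unsupported format for Textract processing: {filename!r}")
```

For example, `pick_processing_mode("contract.pdf", page_count=12)` selects multi-page mode, while a PNG scan stays in single-page mode.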
Tools
textract_parser
Parse documents using AWS Textract OCR and document analysis
Input
| Parameter | Type | Required | Description |
|---|---|---|---|
| accessKeyId | string | Yes | AWS Access Key ID |
| secretAccessKey | string | Yes | AWS Secret Access Key |
| region | string | Yes | AWS region for Textract service (e.g., us-east-1) |
| processingMode | string | No | Processing mode: single-page or multi-page. Defaults to single-page. |
| filePath | string | No | URL to a document to be processed (JPEG, PNG, or single-page PDF). |
| s3Uri | string | No | S3 URI for multi-page processing (s3://bucket/key). |
| fileUpload | object | No | File upload data from file-upload component |
| featureTypes | array | No | Feature types to detect: TABLES, FORMS, QUERIES, SIGNATURES, LAYOUT. If not specified, only text detection is performed. |
| ↳ items | string | No | Feature type |
| queries | array | No | Custom queries to extract specific information. Only used when featureTypes includes QUERIES. |
| ↳ items | object | No | Query configuration |
| ↳ items.Text | string | No | The query text |
| ↳ items.Alias | string | No | Optional alias used to label the query result |
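As a minimal sketch of how these parameters fit together, the helper below assembles an input dictionary for `textract_parser`. The function `build_textract_input` is hypothetical, and two behaviors are assumptions inferred from the parameter descriptions rather than documented rules: an `s3Uri` implies multi-page mode, and `QUERIES` is added to `featureTypes` whenever queries are supplied. The `fileUpload` option is left out for brevity, and the credentials are placeholders.

```python
from __future__ import annotations

from typing import Any, Optional

# Feature types listed in the featureTypes parameter description.
KNOWN_FEATURE_TYPES = {"TABLES", "FORMS", "QUERIES", "SIGNATURES", "LAYOUT"}


def build_textract_input(
    access_key_id: str,
    secret_access_key: str,
    region: str,
    *,
    file_path: Optional[str] = None,
    s3_uri: Optional[str] = None,
    feature_types: Optional[list[str]] = None,
    queries: Optional[list[dict[str, str]]] = None,
) -> dict[str, Any]:
    """Assemble a textract_parser input dict (hypothetical helper, not part of Sim)."""
    features = list(feature_types or [])
    unknown = sorted(set(features) - KNOWN_FEATURE_TYPES)
    if unknown:
        raise ValueError(f"Unknown feature types: {unknown}")
    if bool(file_path) == bool(s3_uri):
        raise ValueError("Provide exactly one of file_path or s3_uri")
    if queries and "QUERIES" not in features:
        # Queries are only used when featureTypes includes QUERIES.
        features.append("QUERIES")

    params: dict[str, Any] = {
        "accessKeyId": access_key_id,
        "secretAccessKey": secret_access_key,
        "region": region,
        # Assumption: S3 sources are processed in multi-page mode.
        "processingMode": "multi-page" if s3_uri else "single-page",
    }
    if s3_uri:
        params["s3Uri"] = s3_uri
    else:
        params["filePath"] = file_path
    if features:
        params["featureTypes"] = features
    if queries:
        params["queries"] = [dict(q) for q in queries]
    return params


example = build_textract_input(
    "YOUR_ACCESS_KEY_ID",
    "YOUR_SECRET_ACCESS_KEY",
    "us-east-1",
    file_path="https://example.com/invoice.png",
    feature_types=["TABLES"],
    queries=[{"Text": "What is the invoice number?", "Alias": "invoice_number"}],
)
```

Adding `QUERIES` automatically is a convenience choice for this sketch; raising an error when the two parameters disagree would be an equally reasonable design.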
Output
| Parameter | Type | Description |
|---|---|---|
| blocks | array | Array of Block objects containing detected text, tables, forms, and other elements |
| ↳ BlockType | string | Type of block (PAGE, LINE, WORD, TABLE, CELL, KEY_VALUE_SET, etc.) |
| ↳ Id | string | Unique identifier for the block |
| ↳ Text | string | Recognized text of the block (for example, a word or line) |
| ↳ TextType | string | Type of text (PRINTED or HANDWRITING) |
| ↳ Confidence | number | Confidence score (0-100) |
| ↳ Page | number | Page number |
| ↳ Geometry | object | Location and bounding box information |
| ↳ Geometry.BoundingBox | object | Axis-aligned bounding box, expressed as ratios of the document dimensions |
| ↳ Geometry.BoundingBox.Height | number | Height as ratio of document height |
| ↳ Geometry.BoundingBox.Left | number | Left position as ratio of document width |
| ↳ Geometry.BoundingBox.Top | number | Top position as ratio of document height |
| ↳ Geometry.BoundingBox.Width | number | Width as ratio of document width |
| ↳ Geometry.Polygon | array | Polygon coordinates |
| ↳ Geometry.Polygon.X | number | X coordinate |
| ↳ Geometry.Polygon.Y | number | Y coordinate |
| ↳ Relationships | array | Relationships to other blocks |
| ↳ Relationships.Type | string | Relationship type (CHILD, VALUE, ANSWER, etc.) |
| ↳ Relationships.Ids | array | IDs of related blocks |
| ↳ EntityTypes | array | Entity types for KEY_VALUE_SET (KEY or VALUE) |
| ↳ SelectionStatus | string | For checkboxes: SELECTED or NOT_SELECTED |
| ↳ RowIndex | number | Row index for table cells |
| ↳ ColumnIndex | number | Column index for table cells |
| ↳ RowSpan | number | Row span for merged cells |
| ↳ ColumnSpan | number | Column span for merged cells |
| ↳ Query | object | Query information for QUERY blocks |
| ↳ Query.Text | string | Query text |
| ↳ Query.Alias | string | Query alias |
| ↳ Query.Pages | array | Pages to search |
| documentMetadata | object | Metadata about the analyzed document |
| ↳ pages | number | Number of pages in the document |
| modelVersion | string | Version of the Textract model used for processing |
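Because the output is a flat list of blocks linked by `Relationships`, most consumers need to resolve those links before the data is useful. The sketch below shows one way to do that, assuming the `blocks` array keeps the field names listed in the table above. All function names are hypothetical, merged cells are placed at their top-left position only, and row and column indices are treated as 1-based, as in Textract responses.

```python
from __future__ import annotations

from typing import Any

Block = dict[str, Any]


def related_ids(block: Block, rel_type: str) -> list[str]:
    """Collect the IDs of blocks linked to `block` by a given relationship type."""
    ids: list[str] = []
    for rel in block.get("Relationships") or []:
        if rel.get("Type") == rel_type:
            ids.extend(rel.get("Ids") or [])
    return ids


def text_of(block: Block, by_id: dict[str, Block]) -> str:
    """Join the text of a block's CHILD WORD blocks; mark selected checkboxes."""
    parts: list[str] = []
    for child_id in related_ids(block, "CHILD"):
        child = by_id.get(child_id, {})
        if child.get("BlockType") == "WORD":
            parts.append(child.get("Text", ""))
        elif child.get("SelectionStatus") == "SELECTED":
            parts.append("[x]")
    return " ".join(parts)


def lines_of_text(blocks: list[Block], min_confidence: float = 0.0) -> list[str]:
    """Return the text of LINE blocks at or above a confidence threshold (0-100)."""
    return [
        b.get("Text", "")
        for b in blocks
        if b.get("BlockType") == "LINE" and b.get("Confidence", 0.0) >= min_confidence
    ]


def key_value_pairs(blocks: list[Block]) -> dict[str, str]:
    """Pair each KEY block's text with the text of its VALUE block(s)."""
    by_id = {b["Id"]: b for b in blocks}
    pairs: dict[str, str] = {}
    for block in blocks:
        is_key = "KEY" in (block.get("EntityTypes") or [])
        if block.get("BlockType") == "KEY_VALUE_SET" and is_key:
            values = [by_id[v] for v in related_ids(block, "VALUE") if v in by_id]
            pairs[text_of(block, by_id)] = " ".join(text_of(v, by_id) for v in values)
    return pairs


def table_grid(table: Block, blocks: list[Block]) -> list[list[str]]:
    """Rebuild a TABLE block as a row-major grid of cell text."""
    by_id = {b["Id"]: b for b in blocks}
    cells = [
        by_id[i]
        for i in related_ids(table, "CHILD")
        if by_id.get(i, {}).get("BlockType") == "CELL"
    ]
    if not cells:
        return []
    n_rows = max(c["RowIndex"] for c in cells)
    n_cols = max(c["ColumnIndex"] for c in cells)
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for cell in cells:
        # Indices are 1-based, so shift them to 0-based list positions.
        grid[cell["RowIndex"] - 1][cell["ColumnIndex"] - 1] = text_of(cell, by_id)
    return grid
```

A typical flow is to call `key_value_pairs(blocks)` for form data when `FORMS` was requested, and `table_grid(table, blocks)` for each block whose `BlockType` is `TABLE` when `TABLES` was requested.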