Intelligent Document Processing
This hands-on lab guides you through creating an Intelligent Document Processing (IDP) system that automates document extraction and processing using AI services. You'll build a complete workflow that integrates OCR, natural language processing, and webhook triggers.
Use Case
Build an Intelligent Document Processing system that uses AI services to extract and analyze documents automatically.
Prerequisites
An S3 bucket pre-loaded with sample documents.
Webhook API access credentials configured.
Step 1: Build Document Extraction Prompt in Prompt Playground
Navigate to the Prompt Playground.
Click Add new in the top right corner to create a new prompt.
Open Prompt templates and select the Po Extractor template. This template will create a new summarization prompt.
Rename your prompt to "Document Extractor".
Save the prompt.
Test & Compare:
In the Test & Compare tab, select different models from the dropdown to test the prompt (hint: compare Claude Sonnet 3.7, Claude Sonnet 3.5 v2, and Amazon Nova Pro).
Test and compare the prompt responses with the selected models and the guardrail.
Select the best-performing model as the Primary Model.
Optionally assign a Fallback Model.
Click Save prompt run in the right corner to save the prompt run.
Save and publish the prompt.
Step 2: Create Notification Agent Prompt
Return to Prompt Playground.
Click Add new in the top corner to add a new prompt.
Open Prompt templates and select the Notify Agent template.
Once the template is selected, increase the Max State Updates from 3 to 20 for complex reasoning.
In the agent input field, use the following sample input for testing.
Name the prompt "Notification Agent" and save it.
Open your prompt and go to the Tools tab to configure the following agent tools:
Messaging Tool:
Click Add new to create a new tool.
Name: "Messaging tool".
Description: "Sends email notifications about document processing status".
Type: Messaging
Messaging type: Email
In the Input Schema section, use the following email schema and save the tool.
Enter the following credentials:
Email
Password
SMTP Server
SMTP Port
Recipient email address
Provide appropriate email subject and message body.
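For orientation, the fields above map onto an ordinary SMTP send. The sketch below uses Python's standard smtplib and email.message modules to show roughly what a tool holding these credentials would do. It is not the platform's implementation: the addresses, subject, and body are placeholders, and the use of STARTTLS is an assumption that depends on your mail provider.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Assemble a plain-text email from the fields the tool asks for."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_message(msg: EmailMessage, server: str, port: int, password: str) -> None:
    """Send over SMTP using STARTTLS (an assumption; check your provider)."""
    with smtplib.SMTP(server, port, timeout=30) as smtp:
        smtp.starttls()
        smtp.login(msg["From"], password)
        smtp.send_message(msg)


# Placeholder values for illustration only.
example = build_message(
    sender="sender@example.com",
    recipient="recipient@example.com",
    subject="Document processing complete",
    body="The document was extracted and the JSON output was stored.",
)
```

Calling send_message(example, server, port, password) with your own SMTP Server, SMTP Port, and Password values is a quick way to confirm the credentials work before entering them in the tool.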
Test & Compare:
In the Test & Compare tab, select different models from the dropdown to test the prompt.
Click the Test button and compare the agent responses across the selected models.
Click Select as best answer on the best-performing model to make it the Primary Model.
Optionally, click Select as best answer on another model to assign it as the Fallback Model.
Click Save prompt run in the right corner to save the prompt run.
Save and publish the prompt.
Step 3: Create an Agent 2.0 Recipe
Navigate to the Recipes section.
Click Add New in the top right corner to create a new recipe.
Configure recipe details:
Name: Intelligent Document Processing.
Type: Agent 2.0.
Set up the recipe workflow nodes by selecting each of the following elements and dragging them onto the recipe canvas:
Webhook:
From the right side panel, copy the Webhook URL and Webhook Token to a notepad for later API use.
Select the Query method as POST.
Connector:
Select Amazon S3 as the data source.
Connect the Webhook node to the Connector node.
Processing:
Enable the OCR option and select Amazon Textract.
Enable Extract Layouts and Extract Table.
Connect the Connector to the Processing node.
Start:
Connect the Processing node to the Start node.
Prompt:
Label: "Document Extractor".
From the prompt dropdown, select the document extraction prompt created earlier.
Scroll down and set the State settings:
Document Cache: Enable Document Cache and select the Type as Ephemeral.
Messages: Enable Messages, and select message context as Last Message.
Connect the Start node to the Prompt node.
Sink:
Connect the Prompt Node to the Sink Node to store JSON output.
Sink node configuration:
S3 bucket path: s3://karini-ai-workshop/idp-sink/
File name pattern: po_processing/{filename_prefix}{current_datetime}.json
(The pattern follows the form shown in the example processed/{filename_prefix}{current_datetime}.json.)
Agent:
Label: "Notification Agent".
From the agents dropdown, select your notification agent created earlier.
Scroll down and set the State settings:
Document Cache: Enable Document Cache and select the Type as Ephemeral.
Messages: Enable Messages, and select message context as Last Message.
Enable metadata.
Connect the Document Extractor Prompt node to the Notification Agent node.
End:
Connect the Notification Agent node to the End node.
On the right side, set the Number of state updates to a higher value such as 75.
Save and Publish the recipe by assigning an initial version.
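Before publishing, it can help to check the wiring against the connections listed in this step. The sketch below writes those connections out as a small graph and confirms that every node is reachable from the Webhook node. The node names are labels chosen for this sketch; they are not an export format or API of the platform.

```python
from collections import deque

# Connections as described in Step 3; names are labels for this sketch only.
EDGES = {
    "Webhook": ["Connector"],
    "Connector": ["Processing"],
    "Processing": ["Start"],
    "Start": ["Document Extractor"],
    "Document Extractor": ["Sink", "Notification Agent"],
    "Notification Agent": ["End"],
    "Sink": [],
    "End": [],
}


def reachable_from(start: str, edges: dict[str, list[str]]) -> set[str]:
    """Breadth-first traversal returning every node reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


unconnected = set(EDGES) - reachable_from("Webhook", EDGES)
print("Unconnected nodes:", sorted(unconnected))  # prints: Unconnected nodes: []
```

Note that the Document Extractor prompt has two outgoing connections: one to the Sink and one to the Notification Agent.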
Refer to the accompanying video for a walkthrough of creating the recipe.
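To get a feel for where the Sink node's output will land, the file name pattern from the Sink configuration can be expanded by hand. The sketch below makes two assumptions that the lab does not state: that {filename_prefix} is the source file name without its extension, and that {current_datetime} is a compact timestamp. The source file name invoice_001.pdf is hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath

SINK_PATH = "s3://karini-ai-workshop/idp-sink/"
PATTERN = "po_processing/{filename_prefix}{current_datetime}.json"


def expand_pattern(pattern: str, source_name: str, now: datetime) -> str:
    """Fill the two placeholders; both interpretations are assumptions."""
    return pattern.format(
        # Assumed: the source file name without its extension.
        filename_prefix=PurePosixPath(source_name).stem,
        # Assumed: a compact timestamp; the real format may differ.
        current_datetime=now.strftime("%Y%m%d%H%M%S"),
    )


key = expand_pattern(PATTERN, "invoice_001.pdf", datetime(2025, 1, 15, 9, 30, 0))
print(SINK_PATH + key)
# prints: s3://karini-ai-workshop/idp-sink/po_processing/invoice_00120250115093000.json
```

Because the pattern places the two placeholders back to back, the prefix and the timestamp run together with no separator in the resulting key.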
Step 4: Trigger Workflow via API
Open a Linux terminal on your computer. Use the following curl command to initiate document processing (hint: ensure the curl command's input payload is valid JSON). Note the output of the command for the next step.
Retrieve the webhook request status by ID:
View all webhooks for a specific recipe:
Refer to the video to trigger the workflow.
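If you prefer to script the trigger, the sketch below shows one way to follow the hint about valid JSON: it parses the payload first and only then builds a POST request with Python's standard urllib. Everything specific here is a placeholder rather than the platform's documented API: the URL, the token value, the header name x-api-token, and the payload shape are all hypothetical, so substitute the values and header used in the lab's curl command.

```python
import json
import urllib.request


def build_webhook_request(
    url: str,
    token: str,
    payload_text: str,
    token_header: str = "x-api-token",  # hypothetical header name
) -> urllib.request.Request:
    """Validate the payload as JSON, then build (but do not send) a POST request."""
    payload = json.loads(payload_text)  # raises json.JSONDecodeError if invalid
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", token_header: token},
        method="POST",
    )


request = build_webhook_request(
    url="https://example.invalid/webhook",  # placeholder for the copied Webhook URL
    token="YOUR_WEBHOOK_TOKEN",             # placeholder for the copied Webhook Token
    payload_text='{"files": []}',           # hypothetical payload shape
)

# To actually send the request:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Validating before sending means a malformed payload fails locally with a clear error instead of producing a rejected webhook request.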
Step 5: Monitor and Verify Results
Click the recipe icon on the left-hand toolbar.
Click the Actions button on the Intelligent Document Processing recipe.
Select Webhook History. Here, you can review the history of all webhook requests along with their inputs, responses, statuses, tokens, and detailed traces.
Additionally, you can open the S3 location of your Sink node and review the JSON output.
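The JSON output can also be reviewed programmatically. The sketch below is one possible approach, assuming boto3 is installed and your AWS credentials have read access to the sink bucket. It splits the sink path into bucket and prefix, then downloads and parses each .json object under that prefix. The helper names are made up for this example.

```python
import json
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split s3://bucket/prefix into (bucket, prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def load_sink_outputs(uri: str) -> dict[str, object]:
    """Download and parse every .json object under the sink prefix.

    Requires boto3 and AWS credentials with read access to the bucket.
    """
    import boto3  # imported lazily so split_s3_uri works without it

    bucket, prefix = split_s3_uri(uri)
    s3 = boto3.client("s3")
    results = {}
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".json"):
                body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
                results[obj["Key"]] = json.loads(body)
    return results


bucket, prefix = split_s3_uri("s3://karini-ai-workshop/idp-sink/")
print(bucket, prefix)  # prints: karini-ai-workshop idp-sink/
```

Calling load_sink_outputs("s3://karini-ai-workshop/idp-sink/") returns a dictionary keyed by object key, which makes it easy to spot-check the extracted fields for each processed document.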
This IDP workflow automates document processing from ingestion through extraction to notification, creating a complete intelligent document handling system.
Refer to the video to review the webhook history.