Agent Recipe
Karini AI’s Agent 2.0 recipe is an advanced Generative AI workflow framework designed to build, automate, and optimize intelligent workflows. It enables the creation of AI-driven agents capable of handling complex processes, decision-making, and seamless system integrations. By incorporating structured prompting, dynamic routing, and advanced data processing, it ensures efficient information management and adaptability to diverse requirements. Its modular, flexible architecture allows for customization, scalability, and automation, enhancing overall workflow efficiency. Additionally, it supports large-scale data handling, real-time interactions, and streamlined output management, making it a versatile solution for AI-powered automation across various applications.
To create a new agentic workflow recipe, go to the recipe page, click Add new, select the appropriate runtime option, provide a user-friendly name and detailed description, and choose Agent 2.0 for the recipe type.
Agent 2.0 recipes can be initiated with either a Chat node or a Webhook node, depending on the specific use case and the desired interaction flow.
Chat
This node is designed to initiate and facilitate user interactions, serving as the entry point for conversational engagement.
Conversation History:
Determine the number of messages to retain in the conversation history. This allows you to augment the prompt context with past conversations, improving response quality.
Generate follow-up questions:
Enable Generate follow-up questions to prompt the system to autonomously generate relevant questions based on the conversation context and generated answer. To use this option, the Follow Up question generator model must be configured in the Organization.
You are provided with a sample prompt for this task; however, you can update it as required.
Enable Audio Mode
When Audio mode is enabled, the audio option becomes available on Copilots. This allows users to interact through voice queries and receive responses in both text and audio formats, enhancing accessibility and user experience.
You must have Speech to Text Model Endpoint and Text to Speech Model Endpoint configured in the Organization.
Webhook
A webhook is a user-defined HTTP callback that allows the system to send automated messages or data to an external URL when an event occurs.
Here are the key elements of a webhook node.
Label: Serves as a unique identifier for the webhook, making it easy to reference or manage.
Webhook URL: The endpoint to which the webhook sends HTTP requests. It directs the data to the appropriate destination.
Webhook Token: Used for authentication to ensure that the request made by the webhook is valid and authorized to access the API.
Query Method: Specifies the HTTP method (such as POST) used for the request to the API.
Webhook Query Headers: Defines the headers included in the request, often containing metadata like content type, authorization, or other necessary information for the API to process the request.
The webhook request must include headers for authentication and content type specification (see the example request after this list).
Payload Template: A predefined structure or format for the data sent with the request. It helps in organizing the information and ensuring that the API receives the correct data structure.
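For illustration, a webhook invocation from an external client might look like the following Python sketch. The URL, header names, and payload fields here are placeholders, not Karini's documented contract; use the values shown on your webhook node.

```python
# Illustrative webhook invocation. Endpoint, token, and payload fields are
# hypothetical placeholders; take the real values from your webhook node.
import requests

WEBHOOK_URL = "https://example.com/agent/webhook"   # hypothetical endpoint
WEBHOOK_TOKEN = "<your-webhook-token>"

headers = {
    "Authorization": f"Bearer {WEBHOOK_TOKEN}",  # authentication
    "Content-Type": "application/json",          # content type specification
}

# The payload template organizes the data the workflow expects.
payload = {
    "query": "What is the refund policy?",       # hypothetical field
    "metadata": {"channel": "support-portal"},   # hypothetical field
}

response = requests.post(WEBHOOK_URL, headers=headers, json=payload, timeout=30)
response.raise_for_status()
print(response.json())
```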
Start
The Start node serves as the entry point of the workflow, initiating the flow of tasks by connecting to various functional nodes based on specific process requirements.
It can link to the Knowledge base node for information retrieval, the Router node for directing execution based on logic, the Prompt node for generating responses, the Custom function node for executing predefined tasks, the Agent node for intelligent automation, and the Transform node for enabling parallel processing. This flexibility allows workflows to be dynamically structured according to operational needs, ensuring efficient execution and automation.
Knowledge Base
A Knowledge base is a systematically structured repository designed to store, organize, and manage information, enabling applications to retrieve relevant data efficiently.
The system supports the following two types of knowledge bases:
Native Knowledge Base: Choose a dataset from the available dataset list. After selecting the dataset, the relevant prompt contexts can be configured to retrieve information from the vector store. For detailed guidance, refer to Context Generation using Vector Search.
Bedrock Knowledge Base: An AI-powered retrieval system offered by Amazon Bedrock, a service from AWS (Amazon Web Services). It allows users to integrate enterprise knowledge bases with AI-powered applications, enabling natural language queries on stored data. For more details, refer to Amazon Bedrock Knowledge Bases.
The configuration fields include:
Knowledge Base ID: Enter the ID of the knowledge base.
Filter Key: Specifies the criteria for filtering results.
Filter Value: Defines the specific value to filter by.
Number of Results: Specifies how many responses should be retrieved.
Overwrite Credentials Option: By default, the AWS credentials configured in Organization settings will be used to invoke AWS resources. However, if needed, you can provide alternate AWS credentials.
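These fields correspond to Amazon Bedrock's Retrieve API. As a rough boto3 illustration of what they map to (a sketch with assumed values, not Karini's internal implementation):

```python
# Rough boto3 illustration of the Bedrock Knowledge Base fields above.
# IDs, region, and filter values are assumptions for this sketch.
import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.retrieve(
    knowledgeBaseId="ABCDEFGHIJ",                          # Knowledge Base ID
    retrievalQuery={"text": "What is the refund policy?"},
    retrievalConfiguration={
        "vectorSearchConfiguration": {
            "numberOfResults": 5,                          # Number of Results
            "filter": {                                    # Filter Key / Filter Value
                "equals": {"key": "department", "value": "finance"}
            },
        }
    },
)

for result in response["retrievalResults"]:
    print(result["content"]["text"])
```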
The following state flags need to be configured based on the use case.
State settings
State settings control data access within the workflow, ensuring that nodes can retrieve and process relevant information.
Document Cache: Provides access to shared document information across nodes.
Retrieve Documents: Fetches the entire document based on a specified filename or file path. Useful for accessing full document content from the knowledge base.
Retrieve Chunks: Fetches specific document chunks based on semantic similarity, ideal for retrieving only relevant parts of a document related to the query.
Ephemeral: Passes context as raw text directly with the query without storing it in the knowledge base.
Messages: Accesses the conversation history for processing, with options for full, last, or specific node messages.
All Messages: Accesses the entire conversation history, allowing the node to consider all previous interactions for context.
Last Message: Accesses only the most recent message in the conversation, useful for nodes that need to respond to the latest input.
Node Message: Accesses messages from a specific node in the workflow, ideal for retrieving targeted information shared by a particular node.
Metadata: Provides access to webhook metadata from connected APIs, enabling external data flow.
Scratchpad: A Scratchpad in agents refers to a temporary storage area where an AI agent keeps track of its intermediate thoughts, steps, or actions while processing a task. It helps the agent plan, reason, and track progress when solving complex problems, especially in multi-step reasoning or decision-making scenarios.
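Conceptually, a scratchpad is just an accumulating record of intermediate steps that is fed back into subsequent prompts. A minimal, framework-agnostic sketch of the idea (not Karini's internal data structure):

```python
# Minimal illustration of the scratchpad concept: the agent appends each
# intermediate thought/action/observation so later steps can see prior work.
# Generic sketch only, not Karini's internal data structure.
scratchpad: list[str] = []

def record(step_type: str, content: str) -> None:
    scratchpad.append(f"{step_type}: {content}")

record("Thought", "The user asked for Q3 revenue; I should query the KB.")
record("Action", "knowledge_base.search('Q3 revenue')")
record("Observation", "Q3 revenue was $4.2M.")

# The accumulated scratchpad is injected into the next prompt as context.
print("\n".join(scratchpad))
```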
Connector
The Connector node functions as an interface facilitating data exchange between the system and external storage solutions.
There are two available connector types, as listed below.
Amazon S3: Enables integration with Amazon Simple Storage Service (S3) for retrieving or storing data.
Overwrite Credentials: By default, the AWS credentials configured in Organization settings will be used to invoke AWS resources. However, if needed, you can provide alternate AWS credentials.
In Memory Base64 Data: Supports handling Base64-encoded data stored in memory for temporary or intermediate processing.
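For illustration, here is how Base64-encoded data is handled in memory using only the Python standard library:

```python
# Standard-library sketch of in-memory Base64 handling: encode bytes for
# transport within the workflow, then decode them back for processing.
import base64

raw_bytes = b"%PDF-1.7 ... (file contents held in memory)"

encoded = base64.b64encode(raw_bytes).decode("ascii")  # safe to embed in JSON
decoded = base64.b64decode(encoded)                    # recover original bytes

assert decoded == raw_bytes
```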
Processing
The Processing node is employed in the agent recipe to enable file uploads to Copilot for querying. It streamlines the processing of uploaded files by providing configurable options designed to support specific data extraction and privacy requirements.
Karini AI recipes support the following preprocessing options:
Enable Transcriptions: Enables automatic transcription, facilitating the conversion of audio or speech-based files into text.
Default: With the Default method, the OpenAI Whisper model is used; it must be selected as the Speech-to-Text Model Endpoint on the Organization page.
Amazon Transcribe: Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text.
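For context, a transcription job against Amazon Transcribe looks roughly like the following boto3 sketch. This is illustrative only; the recipe invokes Transcribe on your behalf when this option is selected, and the job name and S3 URI are placeholders.

```python
# Rough boto3 sketch of an Amazon Transcribe job (illustrative only;
# job name and media URI are hypothetical placeholders).
import boto3

transcribe = boto3.client("transcribe", region_name="us-east-1")

transcribe.start_transcription_job(
    TranscriptionJobName="meeting-audio-demo",             # hypothetical name
    Media={"MediaFileUri": "s3://my-bucket/meeting.mp3"},  # hypothetical URI
    MediaFormat="mp3",
    LanguageCode="en-US",
)

# Poll for completion, then fetch the transcript location.
job = transcribe.get_transcription_job(TranscriptionJobName="meeting-audio-demo")
print(job["TranscriptionJob"]["TranscriptionJobStatus"])
```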
OCR Options: This option provides various methods for extracting text from documents and images:
Unstructured IO with Extract Images: This method is used for extracting images from unstructured data sources. It processes unstructured documents, identifying and extracting images that can be further analyzed or used in different applications.
PyMuPDF with Fallback to Amazon Textract: This approach uses PyMuPDF to extract text and images from PDF documents. If PyMuPDF fails or is insufficient, the process falls back to Amazon Textract, ensuring comprehensive extraction by leveraging Amazon's advanced OCR capabilities (see the fallback sketch after these OCR options).
Amazon Textract: A cloud-based OCR service that identifies and extracts text, structured data, and elements like tables and forms from documents.
Extract Layouts:
This option helps recognize the structural layout of a document, such as:
Headings
Paragraphs
Columns
Useful for retaining document formatting.
Extract Tables:
This option allows structured table extraction, preserving row and column relationships.
Useful for processing invoices, reports, and tabular data.
Extract Forms:
This setting extracts key-value pairs from documents, such as:
Form fields and their corresponding values.
Useful for processing application forms, contracts, and structured documents.
Tesseract: An open-source OCR engine for extracting text from images and PDFs.
VLM: A specialized method for processing and extracting text from images or documents using visual language models. You must have VLM configured in the Organization.
The VLM Prompt provides a predefined instruction set guiding the AI model on how to analyze the image and what details to extract.
The VLM Prompt instructs the system to analyze an image in-depth, extracting all visible text while preserving structure and order. Additionally, it provides a detailed description of diagrams, graphs, or scenes, explaining components, relationships, and inferred meanings to ensure a comprehensive textual representation of the image.
The VLM prompt is defined as follows:
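To make the fallback behavior described above concrete, here is a rough sketch of the PyMuPDF-with-Textract-fallback pattern. This illustrates the general technique, not Karini's internal code.

```python
# Illustrative sketch of the "PyMuPDF with fallback to Amazon Textract"
# pattern: try fast local extraction first, then OCR via Textract.
# Not Karini's internal implementation.
import boto3
import fitz  # PyMuPDF

def extract_text(pdf_path: str) -> str:
    # First attempt: fast, local extraction with PyMuPDF.
    try:
        with fitz.open(pdf_path) as doc:
            text = "".join(page.get_text() for page in doc)
        if text.strip():
            return text
    except Exception:
        pass  # fall through to OCR

    # Fallback: Amazon Textract OCR for scanned or image-only documents.
    # Note: the synchronous API handles single-page documents; use the
    # asynchronous Textract API for multi-page PDFs.
    textract = boto3.client("textract", region_name="us-east-1")
    with open(pdf_path, "rb") as f:
        result = textract.detect_document_text(Document={"Bytes": f.read()})
    return "\n".join(
        block["Text"] for block in result["Blocks"] if block["BlockType"] == "LINE"
    )
```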
PII Masking Options: To mask Personally Identifiable Information (PII) within your dataset, enable the PII Masking option. You can specify the entities to be masked by selecting from the available list, ensuring secure data preprocessing.
For more details, refer to the list of PII entities.
Router
Router directs the workflow to the next node based on conditions or input, enabling dynamic branching paths within the agent recipe. This ensures that the workflow can adapt based on the given data or context, enhancing flexibility and decision-making in the process.
Node Methods: The Router supports three node methods for decision-making.
Default: This method follows the standard routing logic, processing data without additional customization. It adheres to a predefined flow, ensuring consistency and simplicity for straightforward workflows that don't require dynamic decision-making.
Using the Default method, routing conditions can be assigned to edges to determine the appropriate node for processing the request.
There are two available options:
Default Routing: Applies when no specific conditions are met.
Custom Routing: Lets you define explicit conditions for each edge in the provided text box.
Prompt:
The Prompt method enables the selection of a predefined prompt for the Router node from the available prompt list.
The selected prompt contains instructions that guide the router on how to direct the workflow.
The router will evaluate the input and route the workflow accordingly based on the prompt’s logic.
Here is the provided sample prompt:
Lambda: The Lambda method integrates AWS Lambda functions to execute custom logic.
Lambda ARN: Enter the Amazon Resource Name (ARN) of the Lambda function you want to invoke. This uniquely identifies the Lambda function within AWS.
Input test payload: A sample payload used to test the Lambda function, helping ensure it behaves as expected with the provided input.
Test button: Enables you to validate the function by executing the test payload.
Overwrite Credentials Option: By default, the AWS credentials configured in Organization settings will be used to invoke AWS resources. However, if needed, you can provide alternate AWS credentials.
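For illustration, a routing Lambda might look like the sketch below. The event shape and the returned route names are assumptions for this sketch, not a documented interface; confirm the contract for your deployment.

```python
# Hypothetical routing Lambda: inspects the incoming state and returns the
# name of the edge/node to follow. The "input" key and the return shape are
# assumptions for this sketch, not a documented interface.
def lambda_handler(event, context):
    query = str(event.get("input", "")).lower()

    if "invoice" in query or "billing" in query:
        return {"route": "billing_node"}   # hypothetical edge label
    if "refund" in query:
        return {"route": "refund_node"}    # hypothetical edge label
    return {"route": "default_node"}       # fallback route
```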
Invoke Retries specifies the number of retry attempts when an execution fails, ensuring improved reliability and fault tolerance in processing.
The State settings are detailed in the preceding section; refer to it for comprehensive information.
Prompt
The Prompt node enables the system to execute logic-driven actions based on predefined prompts. You can select a prompt from the existing prompts in the Prompt Playground to add to the recipe.
Once a prompt is selected, the system displays the associated primary and fallback models, along with guardrails attached if they were configured in the Prompt Playground.
Additionally, you can navigate to the Prompt Playground by clicking the redirect icon, allowing you to view the prompt details and make necessary modifications.
The following image illustrates the version and redirect icon.
To switch to a specific version, click on the displayed version. This will generate a list of all associated versions. Select the required version, and the system will load the complete prompt details corresponding to the selected version.
The following image displays the associated versions.
Refer to the sample prompt.
The Guardrail option is available on this tile. You can choose from existing guardrails in the Prompt Playground, which will be reflected in the recipe upon prompt selection. Alternatively, you may enable the default guardrail configured at the organizational level.
Invoke Retries specifies the number of retry attempts when an execution fails, ensuring improved reliability and fault tolerance in processing.
The State settings are detailed in the preceding section; refer to it for comprehensive information.
Ensure that the scratchpad is enabled and that the prompt includes Scratchpad as a variable.
Agent
Select an agent prompt from the available options. Each prompt encompasses pre-configured tools and settings essential for processing inquiries and responding effectively.
Once you've selected the prompt, the canvas will reveal the tools and configurations integrated into the agent prompt. You can update the configurations for the preset tools. You cannot add or delete an agent tool from the recipe canvas. In order to edit tool types, or add or delete tools, you need to edit the agent prompt in the prompt playground. These tools empower the agent to analyze queries thoroughly and generate precise responses.
Once an agent is selected, the system displays the associated primary and fallback models along with the current version. Additionally, you can navigate to the Prompt Playground by clicking the redirect icon, allowing you to view the agent details and make necessary modifications.
The following image illustrates the version and redirect icon.
To switch to a specific version, click on the displayed version. This will generate a list of all associated versions. Select the required version, and the system will load the complete agent details corresponding to the selected version.
The following image displays the associated versions.
Refer to the sample agent prompt.
Invoke Retries specifies the number of retry attempts when an execution fails, ensuring improved reliability and fault tolerance in processing.
The State settings are detailed in the preceding section; refer to it for comprehensive information.
Ensure that the scratchpad is enabled and that the agent prompt includes Scratchpad as a variable.
End
Marks the conclusion of a workflow, signaling that no further actions are required.
Sink
Integrating the Sink node into a workflow facilitates the secure storage, export, or transmission of processed data from upstream nodes to a designated destination. This ensures that the final output is efficiently managed, preserved, and made available for further use or analysis. The Sink node is designed to seamlessly store structured or unstructured data in cloud storage solutions such as Amazon S3 or a database.
There are two output types:
Amazon S3
S3 Bucket Path: Provide the target Amazon S3 bucket path where the data will be stored.
File Name Pattern: Allows you to define a structured naming pattern for saved files.
Note: Files are saved in .json format by default.
Save Raw File: The raw file is saved without additional processing.
Message: Retrieves only the most recent message (last message) in the conversation, making it ideal for nodes that require processing the latest user input.
Lambda: Data is pre-processed within an AWS Lambda function before transmission to the output, enabling dynamic transformations such as formatting, filtering, enrichment, and other rule-based modifications to ensure data integrity and compliance with business logic.
Lambda ARN: Enter the Amazon Resource Name (ARN) of the Lambda function you want to invoke. This uniquely identifies the Lambda function within AWS.
Input Variables:
Defines the parameters or variables to be passed to the Lambda function.
These variables allow for dynamic data handling and contextual processing.
Input test payload: A sample payload used to test the Lambda function, helping ensure it behaves as expected with the provided input.
Test button: Enables you to validate the function by executing the test payload.
Message: Retrieves only the most recent message (last message) in the conversation, making it ideal for nodes that require processing the latest user input.
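For illustration, a Sink preprocessing Lambda might reshape the final message before it is written to the destination. The wrapper key and field names below are assumptions for this sketch.

```python
# Hypothetical Sink preprocessing Lambda: formats, filters, and enriches the
# final output before transmission. The "input" wrapper key and the field
# names are assumptions for this sketch.
import json
from datetime import datetime, timezone

def lambda_handler(event, context):
    message = event.get("input", {})  # assumed wrapper key

    # Example enrichment: attach a processing timestamp.
    record = {
        "answer": message.get("answer", ""),
        "processed_at": datetime.now(timezone.utc).isoformat(),
    }
    # Example filtering: drop empty answers so only meaningful records
    # reach the destination.
    if not record["answer"]:
        return None
    return json.dumps(record)
```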
Transform
The Transform module provides a Split and Merge node that enables users to manipulate data by either splitting it into smaller chunks or merging multiple segments. This functionality is particularly useful in data processing workflows where structured transformation of information is required.
There are two available node methods, listed as follows:
Split: When using the Split node method, input data is divided based on a user-specified strategy. The split operation enhances data processing, retrieval, and transformation by ensuring that each segment adheres to the selected criteria. The method supports various strategies for data segmentation, including:
Character: Splits the data at the character level.
Words: Divides the text into word-based segments.
Lambda: Uses a custom AWS Lambda function to determine the split logic (see the handler sketch at the end of this Split section).
Transform Type: The Lambda function receives an event payload where the entire input is wrapped under the key input. Ensure your function extracts and processes the data accordingly.
Lambda ARN: Enter the Amazon Resource Name (ARN) of the AWS Lambda function that will process and split the data.
Input Test Payload: Enter test data to validate the Lambda function's behavior before deployment.
Test Button: Allows you to execute a test run of the configured Lambda function for validation.
Overwrite Credentials (Optional): Allows you to override existing authentication settings with new credentials.
Pages: Splits content based on document pagination.
Chunk Size: The number of characters, words, or pages allowed in each chunk.
Scratchpad: Serves as temporary storage or an intermediary space for processing and managing data within the workflow. The Split operation uses an input method, meaning it receives the output of the preceding Scratchpad as its input for the current Split process.
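Putting the Lambda split strategy together: per the Transform Type note above, the input arrives under the key input. The return shape in this sketch is an assumption; confirm it against your deployment.

```python
# Sketch of a custom split Lambda. Per the Transform Type note, the entire
# input arrives under the key "input"; the chunk-list return shape is an
# assumption for this sketch.
def lambda_handler(event, context):
    text = event["input"]   # documented wrapper key
    chunk_size = 1000       # characters per chunk (illustrative value)

    # Split into fixed-size character chunks.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return {"chunks": chunks}   # assumed return shape
```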
Merge: The Merge node in the Transform module is used to combine multiple data segments into a unified structure. This is particularly useful when working with split data that needs to be reconstructed or when consolidating multiple data sources into a single format.
Merge Strategy: Determines the format in which the data will be merged. The available options include:
JSON: Combines data into a structured JSON format.
Lambda: Uses a custom AWS Lambda function to programmatically merge data.
Lambda ARN: Provide the AWS Lambda function's Amazon Resource Name (ARN).
Input Test Payload: Sample input data to test the transformation logic.
Test Button: Allows you to validate the function's processing behavior.
Overwrite Credentials (Optional): Allows you to override existing authentication settings with new credentials.
Text: Merges content into plain text format.
Merge Method: Determines how the merging process handles existing data. Two options are available:
Overwrite: Replaces any existing data with the newly merged data, ensuring that only the most recent merged version is retained.
Append: Adds new data to an existing array or list within the JSON structure instead of replacing it.
Scratchpad: Temporary storage used to retain intermediate data before the final output.
Output: The merged data is written to an output location.
Method Selection:
Overwrite: Replaces the existing data.
Extend: Appends new data instead of replacing it.
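The difference between the two methods can be illustrated with plain JSON-like data (a conceptual sketch, not the node's actual internals):

```python
# Conceptual illustration of the two merge methods on JSON-like state.
existing = {"results": ["chunk-1 summary"]}
incoming = {"results": ["chunk-2 summary"]}

# Overwrite: the newly merged data replaces what was there.
overwritten = {**existing, **incoming}
print(overwritten)   # {'results': ['chunk-2 summary']}

# Append/Extend: new data is added to the existing list instead.
appended = {"results": existing["results"] + incoming["results"]}
print(appended)      # {'results': ['chunk-1 summary', 'chunk-2 summary']}
```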
Save and publish recipe
Saving the recipe preserves all configurations and connections made in the workflow for future reference or deployment.
Once a recipe is created and saved, you need to publish it to assign it a version number.
Test recipe
A recipe can be tested by adding and configuring the Output element on the recipe canvas (see Create Recipe).
Click the Test button to open a chat window, allowing interaction through queries. Submit your question and review the response generated by the recipe. The response includes the following:
Answer
You can see the real-time response from your recipe pipeline, which includes the answer to the question, a prompt lens icon, a trace icon, and statistics. If the model selected in the prompt for the recipe supports streaming, you will see a streaming response.
Prompt Lens
Prompt Lens lets you peek behind the scenes as the request is being executed. Here, you can inspect the input being sent to the language models (LLMs), including system instructions, questions, and the prompt. This empowers you to analyze the quality of the context retrieved from the vector store and adjust the context generation strategy if needed.
You can view the streaming in the prompt lens. Once the response in the prompt lens is completed, it auto-refreshes, and then the answer is displayed in the chat widget.
You can view the information in the prompt lens after the request is processed.
Agent scratch pad: The scratch pad aids in refining prompts, documenting interactions, or brainstorming ideas based on the outputs received from the selected models.
Agent response: The agent response refers to the output or action taken by the selected model in response to a user's prompt or query.
Tool response: Gives insights or summaries related to the tools used, performance metrics, or operational status.
Trace
Trace has two sections: Prompt and Attributes.
Prompt: You can view the traces of each operation executed during processing. It includes the following:
Input
Output
Attributes: These include various parameters and metrics associated with each request. Some of the attributes include:
Input Tokens
Completion tokens
Model parameters such as temperature, max tokens, etc.
Statistics
You can view the following statistics when the response is generated after a test.
LLM Response Time: The amount of time, in milliseconds, taken by the LLM to generate the complete response for the given prompt request.
LLM Request Timestamp: Represents the specific time a request was made to the large language model (LLM).
Time to First Token: The time it takes for the model to produce the first token of the response after receiving the prompt. TTFT is particularly relevant for applications utilizing streaming, where providing immediate feedback is crucial.
Input Tokens: Total number of input tokens in the LLM request. This includes the prompt instructions, system prompt, context and user query.
Output Tokens: Total number of output tokens generated by the LLM in response to the prompt request. This number does not exceed the Max Tokens value set during prompt testing.
Input Unsafety Score: Measures the potential risk or danger associated with a given input. A higher score indicates a greater level of unsafety.
Input Toxicity Score: This score represents the likelihood that the input text could be perceived as toxic or harmful.
Export Recipe
To export the recipe and deploy the copilot, refer to the detailed instructions in the Export Recipe section. It provides step-by-step guidance through the entire process, ensuring accuracy and efficiency.
Copilots
To explore the various features and functionalities offered by Copilot, including its capabilities, settings, and customization options, please refer to the detailed section titled Copilots.