Create Recipe
Karini AI's Recipe allows you to create no-code generative AI application pipelines.
To create a new recipe, go to the Recipe Page, click Add New, select Karini as the runtime option, provide a user-friendly name and a detailed description, and choose QNA as the recipe type.
Configure the following elements by dragging them onto the recipe canvas.
Define your data source by configuring the associated data storage connector. Select an appropriate data connector from the list of available connectors.
Configure the storage paths for the connector and apply any filters needed to restrict the data included in the source. Enable recursive search if you need to include data from nested directories or structures.
You can test your data connector setup using the "Test" button.
A dataset serves as an internal collection of dataset items, which are pointers to the data source. For a recipe, you can use an existing dataset, which may reference other data sources, or create a new one, depending on your needs.
Karini AI provides various options for data preprocessing.
For source data that contains PDF or image files, you can perform Optical Character Recognition (OCR) by selecting one of the following options:
Unstructured IO with Extract Images: This method is used for extracting images from unstructured data sources. It processes unstructured documents, identifying and extracting images that can be further analyzed or used in different applications.
PyMuPDF with Fallback to Amazon Textract: This approach utilizes PyMuPDF to extract text and images from PDF documents. If PyMuPDF fails or is insufficient, the process falls back to Amazon Textract, ensuring a comprehensive extraction by leveraging Amazon's advanced OCR capabilities.
Amazon Textract with Extract Table: Amazon Textract is used to extract structured data, such as tables, from documents. This method specifically focuses on identifying and extracting tabular data, making it easier to analyze and use structured information from scanned documents or PDFs.
Custom preprocessor: This option allows the user to apply custom logic or transformations to the OCR-generated text. OCR text often contains noise, inaccuracies, or irrelevant information, and preprocessing helps to clean and structure the text to better serve specific requirements.
Document Types: The feature supports OCR extraction from various document formats, including PDFs, PNG images, and TXT files. It is versatile in handling different types of documents and outputs generated through OCR.
AWS Lambda Integration: This feature allows users to configure a custom AWS Lambda function for preprocessing tasks. By providing the Lambda function's ARN (Amazon Resource Name), users can trigger the function, which executes the custom logic for processing the text. Lambda provides highly scalable, serverless execution with flexible integration, which is ideal for complex or computationally intensive tasks (a minimal handler sketch follows this list).
Lambda ARN: Enter the Amazon Resource Name (ARN) of the AWS Lambda function that will process and split the data.
Input Test Payload: Enter test data to validate the Lambda function's behavior before deployment.
Test Button: Allows you to execute a test run of the configured Lambda function for validation.
Overwrite Credentials (Optional): Allows you to override existing authentication settings with new credentials.
Page Range Specification: Users can specify which pages of the document should be processed. This is useful when only certain sections of a document need to be preprocessed or when processing large documents, allowing for efficient handling of specific pages or page ranges. For example, you can choose to process pages 1–5 or select all pages.
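To illustrate, a minimal custom preprocessor Lambda might look like the sketch below; the payload shape ({"text": ...}) is a hypothetical assumption, since the actual event schema is defined by your recipe configuration.

```python
# A minimal sketch of a custom preprocessor Lambda. The payload shape
# is a hypothetical assumption; the actual event schema is defined by
# your recipe configuration.
import json
import re

def lambda_handler(event, context):
    raw_text = event.get("text", "")

    # Example cleanup: collapse runs of spaces/tabs and drop lines that
    # contain only a page number.
    collapsed = re.sub(r"[ \t]+", " ", raw_text)
    lines = [ln for ln in collapsed.splitlines()
             if not re.fullmatch(r"\s*\d+\s*", ln)]

    return {
        "statusCode": 200,
        "body": json.dumps({"text": "\n".join(lines)}),
    }
```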
Metadata extractor: This feature uses a prompt-driven approach to extract entities and map them to appropriate data types, ensuring compliance with search engine requirements. The output is a clean, valid JSON format, optimized for indexing, querying, and downstream analysis.
You can define entities along with their corresponding data types to enable targeted extraction. If no entities are specified, the system will automatically identify and extract relevant entities based on the input text.
As with the custom preprocessor, you can specify which pages of the document should be processed, which is useful for large documents or when only certain sections need metadata extraction. For example, you can choose to process pages 1–5 or select all pages.
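As an illustration, a hypothetical entity definition and the kind of clean JSON the extractor returns might look like this (field names and schema are illustrative, not the platform's exact format):

```python
# Hypothetical entity definitions: entity name mapped to expected data type.
entities = {
    "invoice_number": "string",
    "invoice_date": "date",
    "total_amount": "number",
}

# Example of the clean, valid JSON output, ready for indexing and querying.
extracted_metadata = {
    "invoice_number": "INV-2024-0042",
    "invoice_date": "2024-03-15",
    "total_amount": 1250.00,
}
```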
OpenSearch is the VectorDB provider for your knowledge base, responsible for managing and storing your vector data.
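For reference, a typical OpenSearch k-NN index mapping for vector data looks like the sketch below; the host, index name, field names, and dimension (1536) are illustrative assumptions, not Karini AI's internal schema.

```python
# A minimal sketch using the opensearch-py client; host, auth, index name,
# field names, and the embedding dimension are illustrative assumptions.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

index_body = {
    "settings": {"index": {"knn": True}},  # enable k-NN vector search
    "mappings": {
        "properties": {
            "chunk_text": {"type": "text"},
            "embedding": {"type": "knn_vector", "dimension": 1536},
        }
    },
}
client.indices.create(index="knowledge-base", body=index_body)
```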
When you link the Dataset element to the Knowledge base element, you can set the Data Processing Preferences that define your vector embedding creation process. These preferences include:
You can define a chunking strategy for your unstructured data:
Recursive: This method divides data hierarchically or based on nested structures.
Semantic: This method segments data based on semantic boundaries or logical breaks in the content.
Layout aware chunking: This method divides data based on the layout and visual structure, preserving the spatial relationships within the content.
Chunk size: Specify the size of each data segment processed by the embedding model.
Chunk overlap: Define the overlap between consecutive data segments. Overlapping helps ensure continuity and context preservation across chunks, especially in sequential or interconnected data.
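To make these two settings concrete, here is a minimal character-based sketch of chunking with overlap; the sizes are illustrative, and the platform's recursive, semantic, and layout-aware strategies are more sophisticated.

```python
# A minimal character-based sketch of chunking with overlap; sizes are
# illustrative and real chunkers split on smarter boundaries.
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Step forward by less than chunk_size so consecutive chunks share
        # `overlap` characters, preserving context across boundaries.
        start += chunk_size - overlap
    return chunks
```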
Only published prompts are available for use in recipes.
When you link the Vector store element to the Prompt element, you can set the context generation preferences that define how your prompt obtains context for the user query.
These options provide several techniques to improve the relevance and quality of your vector search.
Choosing this option conducts a semantic search to retrieve the top_k most similar vector-embedded document chunks and uses these chunks to create a contextual prompt for the Large Language Model (LLM).
Choosing this option conducts a semantic search to retrieve the top_k most similar vector-embedded document chunks and then summarizes these chunks to create a contextual prompt for the Large Language Model (LLM).
Choosing this feature conducts a semantic search to retrieve the top_k most similar vector-embedded document chunks and uses the corresponding text from the original document to create a contextual prompt for the Large Language Model (LLM). You can further restrict the context by selecting one of the following options:
Use entire document: Use the text from the entire document to create context for the prompt.
Use matching page: Use the text from the matching page of the document to create context for the prompt. You can optionally include the previous and next pages to ensure continuity and context preservation.
Top-K: Maximum number of top matching vectors to retrieve.
Top-N: Maximum number of top-ranking vectors to retrieve. This number must be less than the top_k parameter.
Reranker Threshold: A threshold for the relevancy score. The reranker model will select the Top-N vectors that score above the set threshold.
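To show how these settings fit together, here is a minimal sketch of top_k retrieval followed by re-ranking; `vector_search` and `rerank` are hypothetical helpers standing in for the vector store and the configured reranker model.

```python
# A minimal sketch of top_k retrieval followed by re-ranking.
# `vector_search` and `rerank` are hypothetical helpers.
def retrieve_context(query: str, top_k: int = 10, top_n: int = 3,
                     threshold: float = 0.5) -> list[str]:
    candidates = vector_search(query, k=top_k)             # [(chunk, score), ...]
    reranked = rerank(query, [c for c, _ in candidates])   # [(chunk, relevancy), ...]
    kept = [(c, s) for c, s in reranked if s > threshold]  # apply reranker threshold
    kept.sort(key=lambda cs: cs[1], reverse=True)
    return [c for c, _ in kept[:top_n]]                    # top_n < top_k
```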
Multi query rewrite: Use this option when you want to break down complex or ambiguous queries into multiple distinct, simpler queries. You are provided with a sample prompt for this task; however, you can update the prompt as required.
Query expansion: Use this option to expand user queries by augmenting the query with LLM-generated information that can help answer the question. This technique improves retrieval accuracy when user queries are short, abrupt, or unspecific. You are provided with a sample prompt for this task; however, you can update the prompt as required.
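For illustration, a multi query rewrite step might look like the sketch below; `llm` is a hypothetical callable, and the prompt shown is illustrative rather than the built-in sample prompt.

```python
# A minimal sketch of multi query rewrite; `llm` is a hypothetical callable
# and this prompt is illustrative, not the built-in sample prompt.
REWRITE_PROMPT = (
    "Break the user question into up to three simpler, self-contained "
    "search queries, one per line.\n\nQuestion: {question}"
)

def rewrite_queries(question: str) -> list[str]:
    response = llm(REWRITE_PROMPT.format(question=question))
    return [q.strip() for q in response.splitlines() if q.strip()]
```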
Enable ACL restriction:
Enabling ACL restriction filters the knowledge base based on the user's Access Control List (ACL) permissions. Here's how it works:
ACL-Based Filtering: When this feature is enabled, the content that is retrieved from the knowledge base will first be filtered based on the permissions assigned to the user. This means only content that the user is allowed to access will be considered.
Semantic Retrieval: After the ACL filter, the system performs semantic similarity-based retrieval to ensure that the content retrieved is relevant to the user's query.
Security Enhancement: This feature enhances security by ensuring that users can only access content that is permissible according to their ACL. It prevents unauthorized access to sensitive or restricted information by filtering out content the user shouldn't be able to access.
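To make the flow concrete, here is a minimal sketch of ACL-filtered retrieval; `user_permissions`, `vector_search`, and the `acl_group` attribute are hypothetical, and over-fetching before filtering is just one possible strategy.

```python
# A minimal sketch of ACL-filtered retrieval. `user_permissions`,
# `vector_search`, and the `acl_group` attribute are hypothetical.
def acl_retrieve(query: str, user_id: str, top_k: int = 5) -> list[str]:
    allowed = user_permissions(user_id)          # e.g. {"hr-docs", "public"}
    results = vector_search(query, k=top_k * 4)  # over-fetch, then filter
    visible = [(chunk, score) for chunk, score in results
               if chunk.acl_group in allowed]    # keep only permitted content
    return [chunk.text for chunk, _ in visible[:top_k]]
```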
Enable dynamic metadata filtering:
This feature utilizes a Large Language Model (LLM) to generate custom metadata filters. Here's how it works:
Automatic Metadata Filtering: When enabled, the system analyzes metadata keys along with the user's input query. Based on this analysis, it generates dynamic metadata filters that narrow down the knowledge base.
Context-Aware: The metadata filters are designed to be dynamic and context-aware. This means that the filters adjust based on the query and the context of the user's request, ensuring a more accurate retrieval process.
Semantic Retrieval: Once the metadata filters are applied, the system performs retrieval based on semantic similarity. This allows for more focused and precise results.
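A minimal sketch of the idea, assuming a hypothetical `llm_json` helper that prompts an LLM and parses its JSON reply:

```python
# A minimal sketch of LLM-generated metadata filters; `llm_json` is a
# hypothetical helper that prompts an LLM and parses its JSON reply.
def build_metadata_filter(question: str, metadata_keys: list[str]) -> dict:
    prompt = (
        f"Given the metadata keys {metadata_keys} and the question below, "
        'return a JSON object to filter on, e.g. {"department": "finance"}. '
        "Return {} if no filter applies.\n\nQuestion: " + question
    )
    # e.g. {"year": 2023} for "What was in the 2023 revenue report?"
    return llm_json(prompt)
```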
By adding an output element to the recipe, you can test the recipe and analyze the responses. You can configure the following details on the output element.
Determine the number of messages to retain in the conversation history. This allows you to augment the prompt context with past conversation, improving response quality.
Detect input safety: Detect questions or content that has explicit or implicit malicious intent. Examples include discriminatory or illegal content, or content that expresses or requests advice on medical, legal, political, controversial, personal or financial subjects. Content will be classified as unsafe if the Unsafe content threshold is breached. User questions classified as unsafe will not be processed.
Detect input toxicity: Detect questions or content that may be harmful, offensive, or inappropriate. Examples include hate speech, threats, graphic speech or abuse. Content with toxicity score above the threshold value will not be processed.
Detect output toxicity: Detect LLM output content that may be harmful, offensive, or inappropriate. Examples include hate speech, threats, graphic speech or abuse. Content with toxicity score above the threshold value will not be displayed to the user.
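For illustration, a standalone toxicity check along these lines could use Amazon Comprehend's detect_toxic_content API, as sketched below; the threshold and region are examples, and the actual guardrail thresholds are configured on the output element.

```python
# A minimal sketch of a toxicity check with Amazon Comprehend's
# detect_toxic_content API; the 0.6 threshold and region are examples.
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")

def is_toxic(text: str, threshold: float = 0.6) -> bool:
    resp = comprehend.detect_toxic_content(
        TextSegments=[{"Text": text}], LanguageCode="en"
    )
    # Each segment gets an overall Toxicity score between 0 and 1.
    return resp["ResultList"][0]["Toxicity"] >= threshold
```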
Select this option to retrieve precomputed answers from a cache instead of re-executing the entire recipe output generation process. Enabling semantic caching will store previously generated responses and serve them when similar queries are made.
Utilize local cache: Use the local semantic cache specific to your session for faster, personalized results.
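Conceptually, a semantic cache behaves like the sketch below: it reuses a stored answer when a new query embeds close enough to a cached one. `embed` is a hypothetical embedding function and 0.95 is an illustrative similarity cutoff.

```python
# A minimal sketch of a semantic cache; `embed` is a hypothetical embedding
# function and 0.95 is an illustrative cosine-similarity cutoff.
import numpy as np

cache: list[tuple[np.ndarray, str]] = []  # (query embedding, cached answer)

def cached_answer(query: str, cutoff: float = 0.95):
    q = embed(query)
    for vec, answer in cache:
        sim = float(np.dot(q, vec) / (np.linalg.norm(q) * np.linalg.norm(vec)))
        if sim >= cutoff:
            return answer  # similar query seen before: serve the stored answer
    return None  # cache miss: run the full recipe, then store the new result
```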
Faithfulness Assessment: This feature automatically verifies that the responses generated by the copilot are accurately grounded in the relevant context retrieved from the knowledge base. This ensures that the copilot's answers are correct and reliable, significantly enhancing the credibility of the responses.
Answer Relevance: This criterion evaluates the degree to which the copilot's answers address the user’s specific queries. By ensuring that each response is directly relevant to the question posed, it improves the overall effectiveness and user satisfaction with the copilot’s outputs.
Context Relevance: This metric measures how well the retrieved context aligns with the query, ensuring the information provided is precise and contextually appropriate. This focus on context relevance leads to more accurate and helpful responses, particularly in complex queries where the specificity of information is crucial.
Once the recipe is deployed as a copilot (chatbot), these evaluation metrics will automatically apply to all copilot requests. Each user request processed by the copilot is dynamically evaluated against these metrics, with the evaluation scores being continuously updated in the copilot’s history.
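As a rough illustration of how such a metric can be computed, a prompt-based faithfulness check might look like the sketch below, assuming a hypothetical `llm` callable; in practice, Karini AI computes these scores automatically for each copilot request.

```python
# A minimal sketch of a prompt-based faithfulness score; `llm` is a
# hypothetical callable and the prompt wording is illustrative.
FAITHFULNESS_PROMPT = (
    "Context:\n{context}\n\nAnswer:\n{answer}\n\n"
    "On a scale of 0 to 1, how well is every claim in the answer supported "
    "by the context? Reply with only the number."
)

def faithfulness(context: str, answer: str) -> float:
    return float(llm(FAITHFULNESS_PROMPT.format(context=context, answer=answer)))
```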
The Audio Mode feature enables the system to convert spoken language into written text, leveraging advanced speech recognition and speech-to-text technologies. It allows users to interact with the system through voice commands or speech input.
You can save the recipe at any point during the creation. Saving the recipe preserves all configurations and connections made in the workflow for future reference or deployment.
If you need to mask Personally Identifiable Information (PII) within your dataset, you can enable the PII Masking option. You can select from the list of entities that you want masked during data pre-processing. To learn more about the entities, refer to this .
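As an illustration of entity-based masking, the sketch below uses Amazon Comprehend's detect_pii_entities API; whether Karini AI uses this particular service for PII masking is an assumption here.

```python
# A minimal sketch of entity-based PII masking with Amazon Comprehend's
# detect_pii_entities API; the region and masking format are examples.
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")

def mask_pii(text: str) -> str:
    entities = comprehend.detect_pii_entities(Text=text, LanguageCode="en")["Entities"]
    # Replace spans from the end so earlier offsets remain valid.
    for e in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        text = text[:e["BeginOffset"]] + f"[{e['Type']}]" + text[e["EndOffset"]:]
    return text
```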
To use this option, you must have the Custom metadata extraction model configured in the settings.
The View Custom Metadata button is located in the Knowledgebase section. If accessed before recipe processing, it will not display any metadata. Upon completion of recipe processing, the button will display either the custom metadata or the metadata extracted through the metadata extractor prompt available on the dataset tile. For additional details, please refer to the provided links.
You can choose from a selection of available embedding models in Karini AI.
You can choose from the existing prompts in the prompt playground to add to the recipe. When a prompt is added, the corresponding primary and fallback models, along with their parameters, are displayed based on the selections made in the playground. The models are read-only within the recipe and cannot be changed directly. To use a specific model and configuration, you must select, test, and publish the models in the prompt playground before applying the respective versions in the recipe.
Re-ranking improves search relevance by reordering the result set based on the relevancy score. To enable the reranker, you must have the reranker model set in the settings. You can configure the following options for the reranker.
Query rewriting can involve modifying or augmenting user queries to enhance retrieval accuracy. To use this option, you must have the Natural language assistant model configured in the settings.
Use this option to determine whether the user question is a greeting. To use this option, you must have the Greetings detection model configured in the settings. You are provided with a sample prompt for this task; however, you can update the prompt as required. If the user question is classified as a greeting, you can configure a specific response to the greeting in the provided form.
Enable Generate follow-up questions to prompt the system to autonomously generate relevant questions based on the conversation context and the generated answer. To use this option, you must have the Followup question generator model configured in the settings. You are provided with a sample prompt for this task; however, you can update the prompt as required.
You can apply content safety guardrails to the user input and LLM-generated output to filter harmful content. Amazon Comprehend's trust and safety features are used to enforce these guardrails.
With online evaluation, you can perform a real-time evaluation of your responses without needing a pre-existing evaluation dataset. You can select from the following evaluation metrics:
To use this option, you must have the Speech to Text Model Endpoint and Text to Speech Model Endpoint configured in the settings.