Create Recipe

Karini AI's Recipe allows you to build no-code generative AI application pipelines.

Creating a Recipe

To create a new recipe, go to the Recipe Page, click Add New, select Karini as the runtime option, provide a user-friendly name and a detailed description, and choose QNA as the recipe type.

Configure the following elements by dragging them onto the recipe canvas.

Source

Define your data source by configuring the associated data storage connector. Select an appropriate data connector from the list of available connectors.

Configure the storage paths for the connector and apply necessary filters to restrict the data being included in the source. Enable recursive search if needed to include data from nested directories or structures.

You have the option to test your data connector setup using the "Test" button.

Dataset

A Dataset serves as an internal collection of dataset items, which are pointers to the data source. For a recipe, you can use an existing dataset, which may reference other data sources, or create a new one, depending on your needs.

Karini AI provides various options for data preprocessing.

OCR:

For source data that contains PDF or image files, you can perform Optical Character Recognition (OCR) by selecting one of the following options:

  • Unstructured IO with Extract Images: This method is used for extracting images from unstructured data sources. It processes unstructured documents, identifying and extracting images that can be further analyzed or used in different applications.

  • PyMuPDF with Fallback to Amazon Textract: This approach utilizes PyMuPDF to extract text and images from PDF documents. If PyMuPDF fails or is insufficient, the process falls back to Amazon Textract, ensuring comprehensive extraction by leveraging Amazon's advanced OCR capabilities (see the sketch after this list).

  • Amazon Textract with Extract Table: Amazon Textract is used to extract structured data, such as tables, from documents. This method specifically focuses on identifying and extracting tabular data, making it easier to analyze and use structured information from scanned documents or PDFs.
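The PyMuPDF-with-fallback option above can be illustrated with a minimal sketch, assuming a local PDF file and the standard PyMuPDF (fitz) and boto3 Textract APIs. It is only an illustration of the fallback pattern, not Karini AI's internal implementation.

```python
# Illustrative sketch of the "PyMuPDF with fallback to Amazon Textract" pattern.
import fitz   # PyMuPDF
import boto3

def extract_text(pdf_path: str) -> str:
    """Try PyMuPDF first; fall back to Amazon Textract if no text is found."""
    doc = fitz.open(pdf_path)
    text = "\n".join(page.get_text() for page in doc)
    if text.strip():
        return text

    # Fallback: the PDF is likely scanned, so OCR each rendered page with Textract.
    textract = boto3.client("textract")
    lines = []
    for page in doc:
        pixmap = page.get_pixmap(dpi=200)  # render the page to an image
        response = textract.detect_document_text(
            Document={"Bytes": pixmap.tobytes("png")}
        )
        lines += [b["Text"] for b in response["Blocks"] if b["BlockType"] == "LINE"]
    return "\n".join(lines)
```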

PII:

If you need to mask Personally Identifiable Information (PII) within your dataset, you can enable the PII Masking option. You can select the entities that you want masked during data pre-processing. To learn more about the entities, refer to this documentation.
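As an illustration of entity-based masking (not necessarily the service Karini AI uses internally), the sketch below redacts selected entity types using Amazon Comprehend's detect_pii_entities API; the entity names shown are standard Comprehend PII types.

```python
# Illustrative sketch of entity-based PII masking; Karini AI performs masking
# as part of its managed pre-processing.
import boto3

def mask_pii(text: str, entity_types: set[str]) -> str:
    comprehend = boto3.client("comprehend")
    result = comprehend.detect_pii_entities(Text=text, LanguageCode="en")
    # Replace matched spans from the end so earlier offsets stay valid.
    for entity in sorted(result["Entities"], key=lambda e: e["BeginOffset"], reverse=True):
        if entity["Type"] in entity_types:
            start, end = entity["BeginOffset"], entity["EndOffset"]
            text = text[:start] + f"[{entity['Type']}]" + text[end:]
    return text

print(mask_pii("Contact Jane at jane@example.com", {"NAME", "EMAIL"}))
```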

Link your Source element to the Dataset element in the recipe canvas to start creating your data ingestion pipeline.

Knowledge base

Select a VectorDB provider for your knowledge base, such as OpenSearch or Pinecone, which will manage and store your vector data.

Link the Dataset element to the Knowledge base element in the recipe canvas to connect your data with the vector store.

Data Processing Preferences

When you link the Dataset element to the Knowledge base element, you can set the Data Processing Preferences that define how your vector embeddings are created. These preferences include:

Embedding Model:

You can choose from a selection of available embedding models in the Karini AI Model hub.

Chunking Type:

You can define a chunking strategy for your unstructured data:

  1. Recursive: This method divides data hierarchically or based on nested structures.

  2. Semantic: This method segments data based on semantic boundaries or logical breaks in the content.

Chunk Size:

Specify the size of each data segment processed by the embedding model.

Chunk Overlap:

Define the overlap between consecutive chunks of data segments. Overlapping helps ensure continuity and context preservation across chunks, especially in sequential or interconnected data.
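The sketch below shows, purely for illustration, how chunk size and chunk overlap interact in a simple character-based splitter; the platform applies the configured strategy (recursive or semantic) internally.

```python
# Minimal sketch of fixed-size chunking with overlap (character-based for simplicity).
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap          # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("A long unstructured document ... " * 50, chunk_size=500, chunk_overlap=50)
```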

Prompt

You can select from the existing prompts in the prompt playground to add to the recipe. A prompt is associated with an LLM and model parameters.

Link the Knowledge base element to the Prompt element in the recipe canvas, establishing a link that allows the Prompt to access and utilize the knowledge base.

When you link the Knowledge base element to the Prompt element, you can set the context generation preferences that define how your prompt obtains context for the user query.

These options provide several techniques to improve the relevance and quality of your vector search.

Use embedding chunks:

Choosing this option conducts a semantic search to retrieve the top_k most similar vector-embedded document chunks and uses these chunks to create a contextual prompt for the Large Language Model (LLM).

Summarize chunks:

Choosing this option conducts a semantic search to retrieve the top_k most similar vector-embedded document chunks and then summarizes these chunks to create a contextual prompt for the Large Language Model (LLM).

Use the document text for matching embeddings:

Choosing this feature conducts a semantic search to retrieve the top_k most similar vector-embedded document chunks and uses the corresponding text from the original document to create a contextual prompt for the Large Language Model (LLM). You can further restrict the context by selecting one of the following options:

  • Use entire document: Use the text from the entire document to create context for the prompt.

  • Use matching page: Use the text from the matching page of the document to create context for the prompt. You can optionally include the previous and next pages to ensure continuity and context preservation.

Top_k:

Maximum number of top matching vectors to retrieve.
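As a rough sketch of what top_k retrieval means, the example below scores a query embedding against stored chunk embeddings with cosine similarity and keeps the k best matches; in practice the selected vector store performs this search server-side.

```python
# Sketch of top_k semantic retrieval over stored chunk embeddings.
import numpy as np

def top_k_chunks(query_vec: np.ndarray, chunk_vecs: np.ndarray, chunks: list[str], k: int):
    # Cosine similarity between the query and every chunk embedding.
    sims = chunk_vecs @ query_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )
    best = np.argsort(sims)[::-1][:k]          # indices of the k highest similarities
    return [(chunks[i], float(sims[i])) for i in best]
```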

Enable Reranker:

Re-ranking improves search relevance by reordering the result set based on the relevancy score. To enable the reranker, you must have set the reranker model in the Organization settings. You can configure the following options for the reranker (a short sketch of this flow follows the list).

  • Top-N: Maximum number of top-ranking vectors to retrieve. This number must be less than the top_k parameter.

  • Reranker Threshold: A threshold for the relevancy score. The reranker model will select the Top-N vectors that score above the set threshold.
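A minimal sketch of that flow, assuming a cross-encoder style reranker from the sentence-transformers library; the model name is an arbitrary example, not the reranker model configured in your Organization settings.

```python
# Sketch of re-ranking the top_k retrieved chunks, then keeping the Top-N
# results whose relevancy score exceeds the configured threshold.
from sentence_transformers import CrossEncoder

def rerank(query: str, candidates: list[str], top_n: int, threshold: float) -> list[str]:
    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")   # example model
    scores = reranker.predict([(query, doc) for doc in candidates])   # higher = more relevant
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    # Score scale is model-specific; the threshold filters out weak matches.
    return [doc for doc, score in ranked[:top_n] if score >= threshold]
```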

Advanced query reconstruction:

Query rewriting modifies or augments user queries to enhance retrieval accuracy. To use this option, you must have the Natural language assistant model configured in the Organization settings. Sketches of both techniques follow the list below.

  • Multi query rewrite: Use this option when you want to break down complex or ambiguous queries into multiple distinct, simpler queries. You are provided with a sample prompt for this task; however, you can update the prompt as required.

  • Query expansion: Use this option to expand user queries by augmenting the query with LLM-generated information that can help answer the question. This technique helps improve retrieval accuracy when user queries are short, abrupt, or not specific. You are provided with a sample prompt for this task; however, you can update the prompt as required.
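The sketches below illustrate both techniques. `generate` stands in for a call to the configured Natural language assistant model, and the prompts are simplified stand-ins for the editable sample prompts.

```python
# Sketch of the two query reconstruction techniques.
def multi_query_rewrite(question: str, generate) -> list[str]:
    prompt = (
        "Break the following question into up to three simpler, self-contained "
        f"search queries, one per line:\n{question}"
    )
    return [q.strip() for q in generate(prompt).splitlines() if q.strip()]

def query_expansion(question: str, generate) -> str:
    prompt = (
        "Write a short passage of background information that would help answer "
        f"the following question:\n{question}"
    )
    return f"{question}\n{generate(prompt)}"   # augmented query used for retrieval
```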

Output

By adding an Output element to the recipe, you can test the recipe and analyze the responses. You can configure the following details on the Output element.

Conversation History:

Determine the number of messages to retain in the conversation history. This allows you to augment the prompt context with past conversation, improving response quality.
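A minimal sketch of this behaviour, assuming a simple list of role/content messages, where `n_messages` corresponds to the number configured on the Output element.

```python
# Sketch of retaining the last N messages and prepending them to the prompt context.
def build_context(history: list[dict], question: str, n_messages: int) -> str:
    recent = history[-n_messages:]                      # keep only the last N turns
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in recent)
    return f"Conversation so far:\n{transcript}\n\nUser question: {question}"
```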

Detect Greetings:

Use this option to determine whether the user question is a greeting. To use this option, you must have the Greetings detection model configured in the Organization settings. You are provided with a sample prompt for this task; however, you can update the prompt as required. If the user question is classified as a greeting, you can configure a specific response to the greeting in the Recipe Export form.

Generate follow up questions:

Enable Generate follow-up questions to prompt the system to autonomously generate relevant questions based on the conversation context and generated answer. To use this option, you must have the Followup question generator model configured in the Organization settings. You are provided with a sample prompt for this task; however, you can update the prompt as required.

Content Safety:

You can apply content safety guardrails to the user input and LLM generated output to filter harmful content. Amazon Comprehend's Trust and Safety feature is used to enforce these guardrails.

  • Detect input safety: Detect questions or content that has explicit or implicit malicious intent. Examples include discriminatory or illegal content, or content that expresses or requests advice on medical, legal, political, controversial, personal or financial subjects. Content will be classified as unsafe if the Unsafe content threshold is breached. User questions classified as unsafe will not be processed.

  • Detect input toxicity: Detect questions or content that may be harmful, offensive, or inappropriate. Examples include hate speech, threats, graphic speech or abuse. Content with toxicity score above the threshold value will not be processed.

  • Detect output toxicity: Detect LLM output content that may be harmful, offensive, or inappropriate. Examples include hate speech, threats, graphic speech or abuse. Content with toxicity score above the threshold value will not be displayed to the user.
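As an illustration of the toxicity checks above, the sketch below calls Amazon Comprehend's toxic content detection through boto3 and compares the returned toxicity score with a threshold; the exact parameters and thresholds used by Karini AI may differ.

```python
# Sketch of a toxicity check against a configured threshold using Amazon
# Comprehend's toxic content detection (part of its Trust and Safety features).
import boto3

def is_toxic(text: str, threshold: float) -> bool:
    comprehend = boto3.client("comprehend")
    response = comprehend.detect_toxic_content(
        TextSegments=[{"Text": text}], LanguageCode="en"
    )
    # Each segment gets an overall toxicity score between 0 and 1.
    return any(result["Toxicity"] >= threshold for result in response["ResultList"])
```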

Enable retrieval from semantic cache:

Select this option to retrieve precomputed answers from a cache instead of re-executing the entire recipe output generation process. Enabling semantic caching will store previously generated responses and serve them when similar queries are made.

  • Utilize local cache: Use the local semantic cache specific to your session for faster, personalized results.
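A minimal sketch of semantic cache retrieval, assuming a hypothetical `embed` helper and an in-memory list of cached entries; Karini AI manages the cache for you, so this only illustrates the idea.

```python
# Sketch of semantic cache lookup: return a cached answer when a previously
# answered query is similar enough to the incoming one.
import numpy as np

def cached_answer(question: str, cache: list[dict], embed, threshold: float = 0.9):
    q_vec = embed(question)
    for entry in cache:                       # entry = {"embedding": np.ndarray, "answer": str}
        sim = float(q_vec @ entry["embedding"] /
                    (np.linalg.norm(q_vec) * np.linalg.norm(entry["embedding"]) + 1e-9))
        if sim >= threshold:
            return entry["answer"]            # cache hit: skip recipe execution
    return None                               # cache miss: run the full recipe
```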

Online Evaluation:

With online evaluation, you can perform a real-time evaluation of your copilot’s responses without needing a pre-existing evaluation dataset. You can select from the following evaluation metrics:

  1. Faithfulness Assessment: This feature automatically verifies that the responses generated by the copilot are accurately grounded in the relevant context retrieved from the knowledge base. This ensures that the copilot's answers are correct and reliable, significantly enhancing the credibility of the responses.

  2. Answer Relevance: This criterion evaluates the degree to which the copilot's answers address the user’s specific queries. By ensuring that each response is directly relevant to the question posed, it improves the overall effectiveness and user satisfaction with the copilot’s outputs.

  3. Context Relevance: This metric measures how well the retrieved context aligns with the query, ensuring the information provided is precise and contextually appropriate. This focus on context relevance leads to more accurate and helpful responses, particularly in complex queries where the specificity of information is crucial.

Once the recipe is deployed as a copilot (chatbot), these evaluation metrics automatically apply to all copilot requests. Each user request processed by the copilot is dynamically evaluated against these metrics, with the evaluation scores continuously updated in the copilot's history.
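As an illustration of how such metrics can be computed, the sketch below uses an LLM-as-judge style check for faithfulness; `generate` is a hypothetical call to a judge model, and the actual metric definitions and prompts are managed by Karini AI.

```python
# Sketch of an LLM-as-judge style faithfulness check in the spirit of the
# online evaluation metrics above.
def faithfulness_score(answer: str, context: str, generate) -> float:
    prompt = (
        "On a scale of 0 to 1, how well is the ANSWER supported by the CONTEXT? "
        "Reply with a single number.\n"
        f"CONTEXT:\n{context}\n\nANSWER:\n{answer}"
    )
    return float(generate(prompt).strip())
```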

Link the Prompt element to the Output element in the recipe canvas, establishing a link that allows the Output element to access the response generated by the LLM using the prompt.

Saving a Recipe

You can save the recipe at any point during creation. Saving the recipe preserves all configurations and connections made in the workflow for future reference or deployment.
