
Dataset Dashboard



On the Datasets page, you can select a pre-created dataset and view dashboards that give you insights into its data processing tasks.

Below is an example of a dataset dashboard and the processing tasks associated with it.

Optical Character Recognition (OCR)

  • OCR enables the extraction of text from images or scanned documents, making the data more accessible and searchable (see the example sketch after this list).

  • Count: Number of dataset items processed using OCR.

  • Processing Status: Indicates whether the OCR task was successful or if there were errors during processing.
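
To make the concept concrete, here is a minimal, illustrative OCR sketch in Python using the open-source pytesseract library; the library choice and file name are assumptions for the example and do not reflect Karini's internal OCR pipeline.

```python
# Minimal OCR sketch (illustrative only; assumes the Tesseract engine and the
# pytesseract/Pillow packages are installed).
from PIL import Image
import pytesseract

def extract_text(image_path: str) -> str:
    """Extract searchable text from a scanned page image."""
    page = Image.open(image_path)              # load the scanned document image
    return pytesseract.image_to_string(page)   # run OCR and return plain text

if __name__ == "__main__":
    print(extract_text("scanned_page.png"))    # hypothetical file name
```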

Personally Identifiable Information (PII)

  • PII handling involves identifying and managing data that could potentially identify a specific individual, such as names, social security numbers, and addresses (see the example sketch after this list).

  • Count: Number of dataset items scanned for PII.

  • Status: Indicates success or errors in identifying and handling PII.
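
As a simplified illustration of PII scanning, the sketch below flags two common PII categories with plain regular expressions; the categories and patterns are assumptions for the example, not Karini's detection logic.

```python
# Illustrative PII scan using simple regular expressions; the categories and
# patterns are assumptions, not Karini's detection logic.
import re

PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),            # US social security numbers
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),   # email addresses
}

def scan_for_pii(text: str) -> dict:
    """Return the PII matches found in the text, grouped by category."""
    return {label: pattern.findall(text) for label, pattern in PII_PATTERNS.items()}

print(scan_for_pii("Contact jane.doe@example.com, SSN 123-45-6789."))
# {'ssn': ['123-45-6789'], 'email': ['jane.doe@example.com']}
```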

Chunking

  • Chunking is the process of splitting documents into smaller, manageable pieces, called chunks, which can be processed independently (see the example sketch after this list).

  • Count: Number of dataset items that underwent the chunking process.

  • Status: Indicates success or errors in chunking.
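
The sketch below shows one common chunking strategy, fixed-size chunks with a small overlap; the chunk size and overlap values are assumptions chosen for the example, not defaults used by Karini.

```python
# Illustrative fixed-size chunking with overlap; the chunk size and overlap
# values are assumptions chosen for the example.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split a document into overlapping chunks that can be processed independently."""
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

document = "Lorem ipsum dolor sit amet. " * 200   # placeholder document text
chunks = chunk_text(document)
print(len(chunks), len(chunks[0]))                # number of chunks, size of the first chunk
```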

Embeddings

  • Embeddings are vector representations of data, such as words, sentences, or images, that capture the semantic meaning and relationships within the data (see the example sketch after this list).

  • Count: Number of dataset items that underwent the embeddings generation process.

  • Status: Indicates success or errors in generating embeddings.
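
The sketch below generates embeddings for a couple of text chunks using the open-source sentence-transformers library; the model name is an assumption for the example, while in Karini embedding models are managed through the Model Hub.

```python
# Illustrative embedding generation with the open-source sentence-transformers
# library; the model name is an assumption for the example.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
chunks = ["Karini AI processes documents.", "Embeddings capture semantic meaning."]
vectors = model.encode(chunks)   # one vector per chunk
print(vectors.shape)             # (2, 384) for this model
```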

Dataset Used in Batch Recipe

For datasets used in a Batch recipe, the dataset dashboard shows an additional chart for the batch-chain task.

Batch-chain

  • Batch-chains refer to the sequence of tasks processed in batches to improve efficiency and manageability. This includes grouping data for processing and ensuring each step in the sequence is completed successfully (see the example sketch after this list).

  • Count: Number of dataset items processed in the batch-chain.

  • Status: Indicates whether each task in the batch-chain was successful or if errors were encountered.
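
The sketch below illustrates the batch-chain idea in generic Python: items are processed in batches through a sequence of steps, and a status is recorded for each item. The step functions and batch size are hypothetical and do not represent Karini's implementation.

```python
# Conceptual sketch of a batch-chain: items are grouped into batches, run through
# a sequence of steps, and a status is recorded for every item.
# The step functions and batch size are hypothetical, not Karini's implementation.
def run_batch_chain(items, steps, batch_size=8):
    """Run each item through the chain of steps, in batches, and record its status."""
    statuses = []
    for start in range(0, len(items), batch_size):
        for item in items[start:start + batch_size]:   # process one batch at a time
            try:
                result = item
                for step in steps:
                    result = step(result)               # each step feeds the next
                statuses.append(("success", result))
            except Exception as err:
                statuses.append(("error", str(err)))
    return statuses

# Hypothetical two-step chain: normalize the text, then count its tokens.
chain = [str.lower, lambda text: len(text.split())]
print(run_batch_chain(["Hello World", "Batch chains group work"], chain))
# [('success', 2), ('success', 4)]
```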

Dataset used in a QnA recipe
Dataset used in a Batch recipe