Spend by LLM Endpoint


Last updated 11 months ago

Cost and usage details for the registered model endpoints can be visualized using the following dashboards: Total Cost, Total API Requests, and Total Tokens. These dashboards provide valuable insight into the operational metrics and usage patterns of your registered model endpoints, helping you identify cost drivers and manage and optimize spend effectively.

The total cost is proportional to the number of API requests and tokens processed. When creating embeddings, more requests or longer texts increase the cost. Similarly, when an LLM generates a response, the cost is proportional to the number of input and output tokens processed.
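The token-based cost relationship described above can be sketched as a simple formula. This is an illustrative calculation only; the function name and the per-1K-token rates are placeholders, not actual Karini AI pricing or API:

```python
# Hypothetical illustration of how endpoint cost scales with tokens.
# Prices are expressed per 1K tokens; the rates below are placeholders.
def endpoint_cost(input_tokens, output_tokens,
                  input_price_per_1k, output_price_per_1k):
    """Cost = input tokens x input rate + output tokens x output rate."""
    return ((input_tokens / 1000) * input_price_per_1k
            + (output_tokens / 1000) * output_price_per_1k)

# e.g. 120,000 input tokens and 30,000 output tokens
# at $0.01 / $0.03 per 1K tokens:
cost = endpoint_cost(120_000, 30_000, 0.01, 0.03)
print(f"${cost:.2f}")  # $2.10
```

Longer prompts raise the input-token term, while longer generated responses raise the output-token term, which is why both appear separately in the Total Tokens dashboard.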

Total Cost

You can visualize the total cost of the registered model endpoints within your Organization. You have the option to choose all endpoints, or a specific set of endpoints, from the dropdown of available endpoints. You can also select an appropriate date range and a granularity of daily or monthly.

The model price set in the Model Hub and the number of Input/Output tokens are used to calculate the endpoint cost for the selected date range.

  • This graph represents the total cost incurred by the model endpoints within your organization for the selected date range.

  • By default, when no endpoint is selected from the dropdown, the graph displays the costs associated with the Top 5 endpoints, with expenses for all other endpoints consolidated under Others.

  • The right-hand side features a color-coded legend, identifying each of the Top 5 endpoints and Others for easy reference.

  • When you select the endpoint(s) from the dropdown, the cost of the selected model endpoint(s) is displayed, accompanied by a color-coded legend on the right-hand side of the graph.

  • When an Administrator adjusts the date filter or granularity, the dashboard dynamically refreshes to display the corresponding data.
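The default "Top 5 endpoints plus Others" view described above is a standard consolidation of a cost breakdown. A minimal sketch of that grouping, with placeholder endpoint names and cost figures (not real data or Karini AI internals):

```python
# Hypothetical sketch of the "Top 5 + Others" consolidation the graph
# applies when no endpoint is selected; all values are placeholders.
def top_n_with_others(costs_by_endpoint, n=5):
    """Keep the n highest-cost endpoints; roll the rest into 'Others'."""
    ranked = sorted(costs_by_endpoint.items(),
                    key=lambda kv: kv[1], reverse=True)
    top, rest = ranked[:n], ranked[n:]
    result = dict(top)
    if rest:
        result["Others"] = sum(cost for _, cost in rest)
    return result

costs = {"endpoint-a": 120.0, "endpoint-b": 95.5, "endpoint-c": 40.2,
         "endpoint-d": 33.1, "endpoint-e": 12.8,
         "endpoint-f": 5.0, "endpoint-g": 2.5}
print(top_n_with_others(costs))
# "Others" consolidates endpoint-f and endpoint-g: 5.0 + 2.5 = 7.5
```

Selecting specific endpoints from the dropdown replaces this default grouping with per-endpoint series for only the endpoints you chose.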

Total API Requests

  • This graph displays the total number of API requests made for the selected model endpoints within the selected date range.

  • By default, when no endpoint is selected from the dropdown, this graph displays the Total API Requests associated with All the endpoints.

  • When you select the endpoint(s) from the dropdown, the Total API Requests of the selected model endpoint(s) is displayed.

  • This visualization helps in tracking the volume of requests and observing trends or patterns over the selected time frame.

Total Tokens

  • This graph shows the total number of tokens consumed (Input tokens) and generated (Output tokens) by the model endpoint(s) during the selected timeframe.

  • By default, when no endpoint is selected from the dropdown, the graph displays the Total Tokens associated with All the endpoints.

  • When you select the endpoint(s) from the dropdown, the Total Tokens of the selected model endpoint(s) is displayed.
