Spend by LLM Endpoint


Last updated 11 months ago

Cost and usage details for the registered model endpoints can be visualized using the following dashboards: Total Cost, Total API Requests, and Total Tokens. These dashboards provide valuable insight into the operational metrics and usage patterns of your registered model endpoints, helping you identify cost drivers and manage and optimize spend effectively.

The total cost is proportional to the number of API requests and tokens processed. When creating embeddings, more requests or longer texts increase the cost. Similarly, when an LLM generates a response, the cost is proportional to the number of input and output tokens processed.
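The token-based cost relationship described above can be sketched as a simple formula. This is an illustrative calculation only; the function name and the per-1K-token rates are placeholders, not actual Karini AI pricing or API:

```python
# Hypothetical illustration of how endpoint cost scales with tokens.
# Prices are expressed per 1K tokens; the rates below are placeholders.
def endpoint_cost(input_tokens, output_tokens,
                  input_price_per_1k, output_price_per_1k):
    """Cost = input tokens x input rate + output tokens x output rate."""
    return ((input_tokens / 1000) * input_price_per_1k
            + (output_tokens / 1000) * output_price_per_1k)

# e.g. 120,000 input tokens and 30,000 output tokens
# at $0.01 / $0.03 per 1K tokens:
cost = endpoint_cost(120_000, 30_000, 0.01, 0.03)
print(f"${cost:.2f}")  # $2.10
```

Longer prompts raise the input-token term, while longer generated responses raise the output-token term, which is why both appear separately in the Total Tokens dashboard.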

Total Cost

You can visualize the total cost of the registered model endpoints within your Organization. You have the option to choose all endpoints, or a specific set of endpoints, from the dropdown of available endpoints. You can also select an appropriate date range and a granularity of daily or monthly.

The model price set in the Model Hub and the number of Input/Output tokens are used to calculate the endpoint cost for the selected date range.

  • This graph represents the total cost incurred by the model endpoints within your organization for the selected date range.

  • By default, when no endpoint is selected from the dropdown, the graph displays the costs associated with the Top 5 endpoints, with expenses for all other endpoints consolidated under Others.

  • The right-hand side features a color-coded legend, identifying each of the Top 5 endpoints and Others for easy reference.

  • When you select the endpoint(s) from the dropdown, the cost of the selected model endpoint(s) is displayed, accompanied by a color-coded legend on the right-hand side of the graph.

  • When an Administrator adjusts the date filter or granularity, the dashboard dynamically refreshes to display the corresponding data.
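The default "Top 5 endpoints plus Others" view described above is a standard consolidation of a cost breakdown. A minimal sketch of that grouping, with placeholder endpoint names and cost figures (not real data or Karini AI internals):

```python
# Hypothetical sketch of the "Top 5 + Others" consolidation the graph
# applies when no endpoint is selected; all values are placeholders.
def top_n_with_others(costs_by_endpoint, n=5):
    """Keep the n highest-cost endpoints; roll the rest into 'Others'."""
    ranked = sorted(costs_by_endpoint.items(),
                    key=lambda kv: kv[1], reverse=True)
    top, rest = ranked[:n], ranked[n:]
    result = dict(top)
    if rest:
        result["Others"] = sum(cost for _, cost in rest)
    return result

costs = {"endpoint-a": 120.0, "endpoint-b": 95.5, "endpoint-c": 40.2,
         "endpoint-d": 33.1, "endpoint-e": 12.8,
         "endpoint-f": 5.0, "endpoint-g": 2.5}
print(top_n_with_others(costs))
# "Others" consolidates endpoint-f and endpoint-g: 5.0 + 2.5 = 7.5
```

Selecting specific endpoints from the dropdown replaces this default grouping with per-endpoint series for only the endpoints you chose.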

Total API Requests

  • This graph displays the total number of API requests made for the selected model endpoints within the selected date range.

  • By default, when no endpoint is selected from the dropdown, this graph displays the Total API Requests associated with All the endpoints.

  • When you select the endpoint(s) from the dropdown, the Total API Requests of the selected model endpoint(s) is displayed.

  • This visualization helps in tracking the volume of requests and observing trends or patterns over the selected time frame.

Total Tokens

  • This graph shows the total number of tokens consumed (Input tokens) and generated (Output tokens) by the model endpoint(s) during the selected timeframe.

  • By default, when no endpoint is selected from the dropdown, the graph displays the Total Tokens associated with All the endpoints.

  • When you select the endpoint(s) from the dropdown, the Total Tokens of the selected model endpoint(s) is displayed.
