Observability

Every request in Karini AI includes a trace that outlines the steps orchestrated by the prompt, agent, recipe, or copilot. This trace allows you to follow the step-by-step process leading to the response at that point in the conversation.
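
The attribute names in these traces mirror the OpenTelemetry gen_ai.* and llm.* semantic conventions, so the spans can be inspected with any OpenTelemetry-compatible backend. As a rough illustration only (the service name and console exporter below are assumptions for the sketch, not Karini AI's actual configuration), a tracer that emits spans of this shape can be set up like this:

```python
# Minimal OpenTelemetry setup sketch. The service name and console exporter
# are illustrative assumptions, not Karini AI's actual configuration.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# The ServiceName attribute in the tables below comes from the resource
# attached to the tracer provider.
resource = Resource.create({"service.name": "copilot-app"})
provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("copilot.tracing")
```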

For each tracing step below, the trace records the prompt (the step's input and output) and a set of attributes on the corresponding span.

Detect Greeting Questions

Prompt:

  1. Input: Greeting detection prompt with input question

  2. Output: Classification output

Attributes (see the sketch after this list):

  1. ServiceName - Information about the application in the resource

  2. SpanName - Internal function name

  3. gen_ai.prompt.0.role - The role of the first prompt message (e.g. system, user)

  4. gen_ai.completion.0.finish_reason - The reason the completion finished (e.g. stop, length)

  5. gen_ai.completion.0.role - The role of the completion message (e.g. assistant)

  6. gen_ai.openai.api_base - The base URL of the OpenAI API endpoint

  7. gen_ai.openai.system_fingerprint - The system fingerprint reported by OpenAI for the backend configuration

  8. gen_ai.request.max_tokens - The maximum number of response tokens requested

  9. gen_ai.request.model - The model requested (e.g. gpt-4, claude, etc.)

  10. gen_ai.request.temperature - The sampling temperature requested

  11. gen_ai.system - The vendor of the LLM (e.g. OpenAI, Anthropic, etc.)

  12. gen_ai.usage.completion_tokens - The number of tokens used for the completion response

  13. gen_ai.usage.prompt_tokens - The number of tokens used for the prompt in the request

  14. llm.headers - The headers used for the request

  15. llm.is_streaming - Whether the response was streamed

  16. llm.request.type - The type of request (e.g. completion, chat, etc.)

  17. llm.usage.total_tokens - The total number of tokens used
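
To make the attribute list concrete, here is a hedged sketch of a span shaped like the Detect Greeting Questions step. The span name, model, and all values are illustrative assumptions, not Karini AI's actual code:

```python
# Hypothetical "Detect Greeting Questions" span; the span name and all
# attribute values are illustrative, not Karini AI's actual code.
from opentelemetry import trace

tracer = trace.get_tracer("copilot.tracing")

with tracer.start_as_current_span("detect_greeting_questions") as span:  # SpanName
    span.set_attribute("gen_ai.system", "OpenAI")
    span.set_attribute("gen_ai.request.model", "gpt-4")
    span.set_attribute("gen_ai.request.max_tokens", 256)
    span.set_attribute("gen_ai.request.temperature", 0.0)
    span.set_attribute("gen_ai.prompt.0.role", "system")
    span.set_attribute("llm.request.type", "chat")
    span.set_attribute("llm.is_streaming", False)
    # ... the classification call would run here; the completion and
    # token usage are then recorded on the same span:
    span.set_attribute("gen_ai.completion.0.role", "assistant")
    span.set_attribute("gen_ai.completion.0.finish_reason", "stop")
    span.set_attribute("gen_ai.usage.prompt_tokens", 42)
    span.set_attribute("gen_ai.usage.completion_tokens", 3)
    span.set_attribute("llm.usage.total_tokens", 45)
```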

Check Content Safety

Prompt:

  1. Input: User query

  2. Output: Content safety check output

Attributes (see the sketch after this list):

  1. ServiceName - Information about the application in the resource

  2. SpanName - Internal function name
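
Steps like this one, which record only ServiceName and SpanName (the retrieval and reranking steps below behave the same way), correspond to plain function-level spans with no gen_ai.* attributes. A minimal sketch, with a hypothetical helper standing in for the real safety check:

```python
# Sketch of a function-level span; run_safety_check is a hypothetical
# placeholder, not Karini AI's actual safety implementation.
from opentelemetry import trace

tracer = trace.get_tracer("copilot.tracing")

def run_safety_check(query: str) -> bool:
    return "unsafe" not in query  # placeholder logic

with tracer.start_as_current_span("check_content_safety"):  # SpanName
    safe = run_safety_check("What is the capital of France?")
```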

Query Embeddings

Prompt:

  1. Input: User query

  2. Output: Vector embeddings of the user query

Attributes (see the sketch after this list):

  1. ServiceName - Information about the application in the resource

  2. SpanName - Internal function name

  3. gen_ai.openai.api_base - The base URL of the OpenAI API endpoint

  4. gen_ai.request.model - The model requested (e.g. gpt-4, claude, etc.)

  5. gen_ai.response.model - The model actually used (e.g. gpt-4-0613, etc.)

  6. gen_ai.system - The vendor of the LLM (e.g. OpenAI, Anthropic, etc.)

  7. gen_ai.usage.prompt_tokens - The number of tokens used for the prompt in the request

  8. llm.headers - The headers used for the request

  9. llm.is_streaming - Whether the response was streamed

  10. llm.request.type - The type of request (e.g. completion, chat, etc.)

  11. llm.usage.total_tokens - The total number of tokens used
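
An embeddings span differs from a chat span mainly in its request type and in having no completion attributes. A hedged sketch (model name and token counts are assumptions):

```python
# Hypothetical "Query Embeddings" span; the model name and token counts
# are illustrative assumptions.
from opentelemetry import trace

tracer = trace.get_tracer("copilot.tracing")

with tracer.start_as_current_span("query_embeddings") as span:
    span.set_attribute("gen_ai.system", "OpenAI")
    span.set_attribute("gen_ai.request.model", "text-embedding-ada-002")
    span.set_attribute("llm.request.type", "embedding")
    # ... the embedding call would run here; usage is recorded afterwards:
    span.set_attribute("gen_ai.response.model", "text-embedding-ada-002")
    span.set_attribute("gen_ai.usage.prompt_tokens", 12)
    span.set_attribute("llm.usage.total_tokens", 12)
```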

Get similar embeddings

Prompt:

  1. Input: User query

  2. Output: Similar embeddings from the vector store

Attributes:

  1. ServiceName - Information about the application in the resource

  2. SpanName - Internal function name

Perform reranking

Prompt:

  1. Input: User query

  2. Output: Reranked similar embeddings using the Cohere reranker

Attributes:

  1. ServiceName - Information about the application in the resource

  2. SpanName - Internal function name

Get QnA Chain Streaming

Prompt:

  1. Input: Prompt, user query, and the reranked context

  2. Output: Response from the LLM

Attributes (see the sketch after this list):

  1. ServiceName - Information about the application in the resource

  2. SpanName - Internal function name

  3. gen_ai.completion.0.finish_reason - The reason the completion finished (e.g. stop, length)

  4. gen_ai.completion.0.role - The role of the completion message (e.g. assistant)

  5. gen_ai.openai.api_base - The base URL of the OpenAI API endpoint

  6. gen_ai.openai.api_version - The OpenAI API version used for the request

  7. gen_ai.prompt.0.role - The role of the first prompt message (e.g. system, user)

  8. gen_ai.request.max_tokens - The maximum number of response tokens requested

  9. gen_ai.request.model - The model requested (e.g. gpt-4, claude, etc.)

  10. gen_ai.request.temperature - The sampling temperature requested

  11. gen_ai.response.model - The model actually used (e.g. gpt-4-0613, etc.)

  12. gen_ai.system - The vendor of the LLM (e.g. OpenAI, Anthropic, etc.)

  13. gen_ai.usage.completion_tokens - The number of tokens used for the completion response

  14. gen_ai.usage.prompt_tokens - The number of tokens used for the prompt in the request

  15. llm.headers - The headers used for the request

  16. llm.is_streaming - Whether the response was streamed

  17. llm.request.type - The type of request (e.g. completion, chat, etc.)

  18. llm.usage.total_tokens - The total number of tokens used
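
For streaming steps, llm.is_streaming is true and the completion attributes can only be set once the stream has finished, as in this hypothetical sketch (the chunk list stands in for a real token stream):

```python
# Hypothetical streaming span; the chunk list is a placeholder for a
# real token stream, not Karini AI's actual code.
from opentelemetry import trace

tracer = trace.get_tracer("copilot.tracing")

with tracer.start_as_current_span("get_qna_chain_streaming") as span:
    span.set_attribute("llm.is_streaming", True)
    span.set_attribute("llm.request.type", "chat")
    chunks = ["Paris", " is", " the", " capital", "."]  # placeholder stream
    answer, completion_tokens = "", 0
    for chunk in chunks:
        answer += chunk
        completion_tokens += 1
    # finish_reason and usage are only known after the stream completes:
    span.set_attribute("gen_ai.completion.0.role", "assistant")
    span.set_attribute("gen_ai.completion.0.finish_reason", "stop")
    span.set_attribute("gen_ai.usage.completion_tokens", completion_tokens)
```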

Get Followup Questions

Prompt:

  1. Input: Follow-up question generation prompt, user query, and the LLM-generated answer to the user query

  2. Output: Follow-up questions

Attributes:

  1. ServiceName - Information about the application in the resource

  2. SpanName - Internal function name

  3. gen_ai.completion.0.finish_reason - The reason the completion finished (e.g. stop, length)

  4. gen_ai.completion.0.role - The role of the completion message (e.g. assistant)

  5. gen_ai.openai.api_base - The base URL of the OpenAI API endpoint

  6. gen_ai.openai.system_fingerprint - The system fingerprint reported by OpenAI for the backend configuration

  7. gen_ai.openai.api_version - The OpenAI API version used for the request

  8. gen_ai.prompt.0.role - The role of the first prompt message (e.g. system, user)

  9. gen_ai.request.max_tokens - The maximum number of response tokens requested

  10. gen_ai.request.model - The model requested (e.g. gpt-4, claude, etc.)

  11. gen_ai.request.temperature - The sampling temperature requested

  12. gen_ai.response.model - The model actually used (e.g. gpt-4-0613, etc.)

  13. gen_ai.system - The vendor of the LLM (e.g. OpenAI, Anthropic, etc.)

  14. gen_ai.usage.completion_tokens - The number of tokens used for the completion response

  15. gen_ai.usage.prompt_tokens - The number of tokens used for the prompt in the request

  16. llm.headers - The headers used for the request

  17. llm.is_streaming - Whether the response was streamed

  18. llm.request.type - The type of request (e.g. completion, chat, etc.)

  19. llm.usage.total_tokens - The total number of tokens used

Agent Executor

Prompt:

  1. Input: Prompt, user question, agent thoughts and actions

  2. Output: Response to the agent action

Attributes (see the sketch after this list):

  1. ServiceName - Information about the application in the resource

  2. SpanName - Internal function name

  3. gen_ai.prompt.0.role - The role of the first prompt message (e.g. system, user)

  4. gen_ai.request.max_tokens - The maximum number of response tokens requested

  5. gen_ai.request.model - The model requested (e.g. gpt-4, claude, etc.)

  6. gen_ai.request.temperature - The sampling temperature requested

  7. gen_ai.system - The vendor of the LLM (e.g. OpenAI, Anthropic, etc.)

  8. gen_ai.usage.completion_tokens - The number of tokens used for the completion response

  9. gen_ai.usage.prompt_tokens - The number of tokens used for the prompt in the request

  10. llm.request.type - The type of request (e.g. completion, chat, etc.)

  11. llm.usage.total_tokens - The total number of tokens used
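
Because the agent executor loops over thoughts and actions, its trace naturally forms a parent span with one child span per action. A hedged sketch of that nesting (span names, attributes, and the action list are all hypothetical):

```python
# Hypothetical agent-executor trace: a parent span with one child span
# per agent action. Names and values are illustrative only.
from opentelemetry import trace

tracer = trace.get_tracer("copilot.tracing")

with tracer.start_as_current_span("agent_executor") as agent_span:
    agent_span.set_attribute("gen_ai.request.model", "gpt-4")
    agent_span.set_attribute("llm.request.type", "chat")
    for i, action in enumerate(["search_docs", "final_answer"]):
        # each thought/action pair becomes a nested child span
        with tracer.start_as_current_span(f"agent_action.{i}") as action_span:
            action_span.set_attribute("agent.action", action)  # illustrative attribute
```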

Get LLM Chain Streaming

Prompt:

  1. Input: Prompt with user query

  2. Output: Response from the LLM

Attributes:

  1. ServiceName - Information about the application in the resource

  2. SpanName - Internal function name

  3. gen_ai.prompt.0.role - The role of the first prompt message (e.g. system, user)

  4. gen_ai.request.max_tokens - The maximum number of response tokens requested

  5. gen_ai.request.model - The model requested (e.g. gpt-4, claude, etc.)

  6. gen_ai.request.temperature - The sampling temperature requested

  7. gen_ai.system - The vendor of the LLM (e.g. OpenAI, Anthropic, etc.)

  8. gen_ai.usage.completion_tokens - The number of tokens used for the completion response

  9. gen_ai.usage.prompt_tokens - The number of tokens used for the prompt in the request

  10. llm.request.type - The type of request (e.g. completion, chat, etc.)

  11. llm.usage.total_tokens - The total number of tokens used
