Test Agent Prompt
An agent prompt can be tested using various combinations of LLMs and model parameters and comparing their responses. Refer to Test & Compare for details about prompt testing.
The following Tracing and Observability features give you insight into how the agent prompt is processed while a prompt request is being executed:
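As a rough illustration of this comparison workflow, the sketch below loops over a few model and temperature combinations and collects their responses side by side. The call_llm helper, the model names, and the parameter values are assumptions for illustration only, not the platform's API.

```python
from itertools import product

def call_llm(model: str, prompt: str, temperature: float, max_tokens: int) -> str:
    """Hypothetical stand-in for your LLM client; replace with a real call."""
    return f"[{model} response at temperature={temperature}]"

prompt = "Summarize the customer's refund request in two sentences."
models = ["model-a", "model-b"]        # hypothetical candidate LLMs
temperatures = [0.2, 0.7]              # candidate model parameters

results = {}
for model, temperature in product(models, temperatures):
    results[(model, temperature)] = call_llm(
        model=model, prompt=prompt, temperature=temperature, max_tokens=512
    )

# Review the collected responses side by side to decide which combination works best.
for (model, temperature), response in results.items():
    print(f"{model} @ temperature={temperature}:\n{response}\n")
```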
Prompt Lens:
Prompt Lens lets you peek behind the scenes as the agent request is being executed. Here, you can inspect the input sent to the language models (LLMs), including system instructions, context, questions, and response guidelines. For agent prompts, you can also review the following information as the prompt request is being processed (an illustrative sketch follows this list).
Agent scratch pad: The scratch pad aids in refining prompts, documenting interactions, or brainstorming ideas based on the outputs received from the selected models.
Agent response: The agent response refers to the output or action taken by the selected model in response to a user's prompt or query.
Tool response: Gives insights or summaries related to the tools used, their performance metrics, or operational status.
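To make the pieces above concrete, here is a minimal sketch of the kind of record Prompt Lens surfaces for one agent request. The field names and sample values are assumptions chosen for readability, not the product's actual schema.

```python
# Field names below are assumptions for illustration, not the product's schema.
prompt_lens_view = {
    "llm_input": {
        "system_instructions": "You are a helpful support agent.",
        "context": "Order #1234 was shipped on 2024-05-01.",
        "question": "Where is my order?",
        "response_guidelines": "Answer in two sentences or fewer.",
    },
    "agent_scratch_pad": [
        "Thought: I need the current shipment status.",
        "Action: call the order-status tool with order_id=1234.",
    ],
    "tool_response": {"tool": "order_status", "status": "in_transit", "eta": "2024-05-05"},
    "agent_response": "Your order is in transit and should arrive by May 5.",
}

# Reviewing each section mirrors what you inspect in Prompt Lens while the request runs.
for section, value in prompt_lens_view.items():
    print(section, "->", value)
```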
Trace: You can see the traces for each operation executed during prompt processing. Each trace includes the following (see the example trace entry after this list):
Input: The input passed to the operation.
Output: The output returned by the operation.
Attributes: Various parameters and metrics associated with each request, such as:
Input tokens
Completion tokens
Model parameters, such as temperature and max tokens
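As a rough illustration, a single trace entry for one operation might be organized like the sketch below. The keys and values are assumptions for illustration; the actual trace attributes shown in the product may differ.

```python
# Keys and values are illustrative assumptions, not the exact attribute names in the product.
trace_entry = {
    "operation": "llm.chat_completion",
    "input": "System instructions + context + user query sent to the model",
    "output": "The text the model returned for this operation",
    "attributes": {
        "input_tokens": 412,          # tokens in the request
        "completion_tokens": 128,     # tokens generated in the response
        "temperature": 0.2,           # model parameters used for the call
        "max_tokens": 512,
    },
}

print(trace_entry["operation"], trace_entry["attributes"])
```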
The response displays the following statistics (a timing sketch follows this list):
Input Tokens: Total number of input tokens in the LLM request. This includes the prompt instructions, system prompt, context and user query.
Output Tokens: Total number of output tokens generated by the LLM in response to the prompt request. This number does not exceed the Max Tokens value set up during prompt testing.
LLM Response Time: The amount of time, in milliseconds, taken by the LLM to generate the complete response for the given prompt request.
Time to First Token (TTFT): The time it takes for the model to produce the first token of the response after receiving the prompt. TTFT is particularly relevant for applications that use streaming, where providing immediate feedback is crucial.
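The timing statistics are easiest to see against a streaming call. The sketch below measures Time to First Token and total LLM response time around a hypothetical streaming generator; stream_llm is a stand-in, not a real client, and the token count is only a rough proxy for the reported Output Tokens.

```python
import time

def stream_llm(prompt: str):
    """Hypothetical streaming call; yields tokens of the response."""
    for token in ["Your ", "order ", "is ", "in ", "transit."]:
        time.sleep(0.05)   # stand-in for network/model latency
        yield token

start = time.perf_counter()
time_to_first_token = None
output_tokens = 0

for token in stream_llm("Where is my order?"):
    if time_to_first_token is None:
        # TTFT: elapsed time until the first token arrives, in milliseconds.
        time_to_first_token = (time.perf_counter() - start) * 1000
    output_tokens += 1

# LLM response time: elapsed time until the full response is complete, in milliseconds.
llm_response_time = (time.perf_counter() - start) * 1000

print(f"Time to First Token: {time_to_first_token:.0f} ms")
print(f"LLM Response Time: {llm_response_time:.0f} ms")
print(f"Output Tokens (rough count): {output_tokens}")
```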
After testing and comparing models, choose the best one and mark it using the Select as best answer option. The same prompt run can be saved using the Select as best run or Save prompt run options provided.
To see prompt runs in detail, refer to the Prompt Runs section.