Test & Compare
Karini AI's prompt playground allows you to test your prompt against different models and model parameters simultaneously, in real time.
A/B Testing
You can select up to 3 models to test your prompts. The models available for selection must be registered in Karini's Model Hub. You can update the model parameters such as Temperature and Max Tokens to tune the performance of your prompt.
The "Test" button will trigger LLM invocation simultaneously for all the selected models for the prompt. You can review the responses generated by the models in real-time and continue to fine-tune and re-test the prompt as required by tweaking the prompt instructions or modifying the model and parameter configurations. The real-time, side-by-side comparison of prompt responses for various models and parameters gives you the ability to view and analyze the variations in responses generated by each model and guides you select the best combination of model and parameters for your prompt task.
Prompt Responses and Statistics
Upon clicking "Test", you can see real-time response from each of the selected model in the prompt test. If the selected model supports streaming, you will see a streaming response. Once the response generation is complete, you can also review the statistics displayed for each prompt test run. They include:
Input Tokens: Total number of input tokens in the LLM request. This includes the prompt instructions, system prompt, context and user query.
Output Tokens: Total number of output tokens generated by the LLM in response to the prompt request. This number does not exceed the Max Tokens value configured during prompt testing.
LLM Response Time: The amount of time in milliseconds taken by the LLM to generate the complete response for the given prompt request.
Time to First Token: The time that it takes for the model to produce the first token of the response after receiving the prompt. TTFT is particularly relevant for applications utilizing streaming, where providing immediate feedback is crucial.
These statistics provide additional guidance when testing the performance of the prompt and the LLM, and can be used to decide whether the prompt output is satisfactory or whether the prompt needs fine-tuning to obtain more precise results. You can save the prompt experiments by clicking the "Save prompt runs" button.
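The playground reports these statistics for you automatically. As a point of reference only, here is a minimal sketch of how such figures can be measured against a streaming endpoint, again assuming Amazon Bedrock's `converse_stream` API rather than anything Karini-specific: Time to First Token is the delay until the first content chunk arrives, LLM Response Time is the wall-clock duration of the full response, and the token counts come from the service's usage metadata.

```python
# Illustrative sketch: measure Input/Output Tokens, LLM Response Time,
# and Time to First Token (TTFT) for a streaming LLM call.
# Assumes AWS credentials and Amazon Bedrock access.
import time
import boto3

bedrock = boto3.client("bedrock-runtime")

def timed_run(model_id, prompt, max_tokens=512):
    start = time.perf_counter()
    first_token_at = None
    chunks = []
    usage = {}

    resp = bedrock.converse_stream(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": max_tokens},
    )
    for event in resp["stream"]:
        if "contentBlockDelta" in event:
            if first_token_at is None:
                first_token_at = time.perf_counter()        # Time to First Token
            chunks.append(event["contentBlockDelta"]["delta"]["text"])
        elif "metadata" in event:
            usage = event["metadata"].get("usage", {})       # token counts reported by the service

    total_ms = (time.perf_counter() - start) * 1000          # LLM Response Time
    ttft_ms = (first_token_at - start) * 1000 if first_token_at else None
    return {
        "response": "".join(chunks),
        "inputTokens": usage.get("inputTokens"),
        "outputTokens": usage.get("outputTokens"),           # will not exceed max_tokens
        "llmResponseTimeMs": round(total_ms),
        "timeToFirstTokenMs": round(ttft_ms) if ttft_ms else None,
    }
```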
Selecting the Best Answer
Based on the live responses and the statistics of the models, you can select the best answer for your prompt requests. Additionally, you can view the model tracing details to get an in-depth understanding of how the LLM executed the request.
Follow these steps to finalize model endpoint selection and manage prompt runs efficiently.
Configuring Model Endpoints
If only one model is tested, then that model can only be assigned as the Primary Model.
Below are the steps to assign a model as the Primary Model.
Click "Select as best answer" to assign a model as the Primary model.
Once selected, the "Select as best answer" button will turn green.
After selecting the Primary model, you must save the prompt run.
Upon saving, you will be redirected to the Edit page to save the prompt.
After conducting comparative testing with multiple AI models, it is possible to designate a Primary Model and a Fallback Model to optimize response accuracy and reliability. The Primary Model serves as the default AI system, selected based on its performance in terms of accuracy, response time, and cost-effectiveness. The Fallback Model acts as a backup to ensure system robustness, automatically taking over when the primary model fails to generate a response due to latency issues, errors, or unavailability. This dual-model configuration enhances system resilience, minimizes downtime, and ensures uninterrupted workflow in AI-driven decision-making environments.
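Conceptually, the Fallback Model is used whenever the Primary Model cannot produce a response. The sketch below is a simplified illustration of that pattern, not Karini's internal implementation; the model IDs, timeout, and helper function are assumptions chosen for the example.

```python
# Conceptual sketch of Primary/Fallback behavior: call the primary
# endpoint first, and fall back automatically on errors or timeouts.
import boto3
from botocore.config import Config
from botocore.exceptions import ClientError, ReadTimeoutError

bedrock = boto3.client(
    "bedrock-runtime",
    config=Config(read_timeout=30, retries={"max_attempts": 1}),
)

PRIMARY = "anthropic.claude-3-sonnet-20240229-v1:0"   # best accuracy/latency/cost trade-off
FALLBACK = "anthropic.claude-3-haiku-20240307-v1:0"   # backup to keep the workflow running

def ask(model_id, prompt):
    resp = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 512},
    )
    return resp["output"]["message"]["content"][0]["text"]

def ask_with_fallback(prompt):
    try:
        return ask(PRIMARY, prompt)        # default path
    except (ClientError, ReadTimeoutError):
        return ask(FALLBACK, prompt)       # takes over on error or timeout
```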
Below are the steps to assign a model as the Primary Model and Fallback model.
Select appropriate model endpoints as Primary and Fallback based on performance.
Click on “Select as best answer” for the preferred model.
A pop-up window will appear, prompting you to choose between Primary or Fallback assignment.
Selecting Primary changes the "Select as best answer" button to green.
Selecting Fallback changes the button to an orange background.
Once selections are made, save the prompt run to confirm your choices. This action will redirect you to the Edit page to save the prompt.
Refer to the video below for selecting the primary and fallback models in prompts and publishing the prompt.
The models designated as Primary and Fallback will be displayed on the Edit page as shown below.
The prompt can now be saved and published with a relevant tag to ensure proper categorization.
All published versions of the prompt are accessible within the Version Tab. Selecting a specific version displays the corresponding LLM and its configuration details on the right panel of the interface.
Refer to the following video for guidance on interfacing with the Version tab in the UI.
To utilize a previously published version, click the "Load Version" button, enabling further configuration and refinement of the selected prompt version. This streamlined process ensures precise model selection, effective version management, and seamless access to prior configurations for optimized deployment.
Refer to the video for the Load version functionality.
The most recently published version will be displayed in the Prompt Table for reference and management.