Test Model's Response
Updated on 08 May 2025

Select a session to review the model's response. The system then opens a testing playground where you can interact with the model.

1. System Message

An instruction that sets the behavior and tone of the AI model. It helps guide how the model responds throughout the conversation.

You can modify the system message based on your preferences. Default message: "You are a helpful assistant"

Example: You are a cloud computing expert helping users with technical issues

2. User Message

The input or question provided by the user to the AI. It serves as the main prompt that the model responds to in a conversation.

Example: What are the benefits of using a private cloud?
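If the platform exposes an OpenAI-compatible chat API (an assumption; the endpoint URL, API key, and model name below are hypothetical placeholders), the system and user messages above map directly to `role` entries in the request. A minimal sketch:

```python
# A minimal sketch, assuming an OpenAI-compatible chat endpoint.
# The base_url, api_key, and model name are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://example.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",             # hypothetical credential
)

response = client.chat.completions.create(
    model="google/gemma-3-4b-it",
    messages=[
        # 1. System message: sets the behavior and tone of the model
        {"role": "system",
         "content": "You are a cloud computing expert helping users with technical issues"},
        # 2. User message: the prompt the model responds to
        {"role": "user",
         "content": "What are the benefits of using a private cloud?"},
    ],
)
print(response.choices[0].message.content)
```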

We support uploading images in .jpeg and .jpg formats for testing with VLMs (Vision-Language Models). The following models support image testing:

  • meta-llama/Llama-3.2-11B-Vision-Instruct
  • Qwen/Qwen2-VL-2B-Instruct
  • Qwen/Qwen2-VL-7B-Instruct
  • Qwen/Qwen2-VL-72B-Instruct
  • Qwen/Qwen2.5-VL-3B-Instruct
  • Qwen/Qwen2.5-VL-7B-Instruct
  • Qwen/Qwen2.5-VL-72B-Instruct
  • google/gemma-3-12b-it
  • google/gemma-3-27b-it
  • google/gemma-3-4b-it
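As a sketch of how an image could accompany a user message, assuming the platform follows the common OpenAI-compatible vision schema (the endpoint and the local file path below are placeholders), a .jpeg file can be base64-encoded and sent as an `image_url` content part:

```python
# A minimal sketch of image input for a VLM, assuming an
# OpenAI-compatible vision schema; endpoint, key, and file
# path are hypothetical placeholders.
import base64
from openai import OpenAI

client = OpenAI(base_url="https://example.com/v1", api_key="YOUR_API_KEY")

# Encode a local .jpeg as a base64 data URL
with open("diagram.jpeg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-7B-Instruct",  # one of the image-capable models above
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```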

3. Settings

  • Temperature: Controls how much creativity is allowed in the responses; typically ranges from 0 to 2. Default: 1. (A low- vs. high-temperature comparison is sketched after this list.)
    • Low Temperature (closer to 0):
      • The model generates more predictable, deterministic responses.
      • It favors high-probability words or tokens, producing more focused and precise output.
    • High Temperature (1 or higher):
      • The model generates more creative, diverse, and unexpected responses.
      • It samples from a wider range of possible words, making the output more varied but potentially less accurate.
  • Advanced settings:
    • Add stop sequence: The stop sequence allows you to control the length and content of the generated text by specifying one or more sequences that, when detected, will cause the model to halt.
    • Output length: Sets the maximum number of tokens (words or subwords) the model can produce in response to a prompt. Default: 8192
    • Top-P: Used in generative models to manage the randomness and diversity of generated text, serving as an alternative to temperature in sampling from the model's output distribution. Default: 0.95
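To see the temperature effect described above, a quick sketch (same assumptions as before: OpenAI-compatible endpoint, hypothetical key and model name) sends the same prompt at a low and a high temperature:

```python
# A minimal sketch comparing low vs. high temperature; endpoint,
# key, and model name are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://example.com/v1", api_key="YOUR_API_KEY")
prompt = [{"role": "user", "content": "Suggest a name for a cloud backup service."}]

for temp in (0.0, 1.5):
    response = client.chat.completions.create(
        model="google/gemma-3-4b-it",
        messages=prompt,
        temperature=temp,  # 0.0 -> predictable, 1.5 -> more varied
    )
    print(f"temperature={temp}: {response.choices[0].message.content}")
```

Running the low-temperature call repeatedly should return near-identical answers, while the high-temperature call should vary noticeably between runs.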

Example Model Output with These Settings:

  • Add Stop Sequence: Third-party
  • Output length: 1000
  • Top-P: 0.95

→ The model will stop either when it encounters the stop sequence or when it reaches the maximum token limit, as sketched below.
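Assuming the same OpenAI-compatible request shape as in the earlier sketches, the example settings above would map to the `stop`, `max_tokens`, and `top_p` parameters roughly like this (endpoint, key, and model name are still placeholders):

```python
# A minimal sketch of the example settings above; endpoint, key,
# and model name are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://example.com/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="google/gemma-3-12b-it",
    messages=[{"role": "user", "content": "Compare public and private clouds."}],
    stop=["Third-party"],  # generation halts if this sequence appears
    max_tokens=1000,       # output length cap from the example
    top_p=0.95,            # nucleus sampling threshold
)
print(response.choices[0].message.content)
```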