Select base models
Updated on 26 Jun 2025

  • We currently offer the following base models for fine-tuning:
| Base Model | Description |
| --- | --- |
| Llama-3.1-8B | Base language model, 8B parameters, versatile, ideal for fine-tuning |
| Llama-3.2-1B | Lightweight base model, 1B parameters, fast, efficient, suitable for edge use |
| Llama-3.2-8B-Instruct | Instruction-tuned model, 8B parameters, optimized for dialogue and tasks |
| Llama-3.2-11B-Vision-Instruct | Multimodal instruction-tuned model, 11B parameters, optimized for vision-language tasks |
| Llama-3.3-70B-Instruct | Instruction-tuned Llama model, 70B parameters, excels at complex tasks |
| Meta-Llama-3-8B-Instruct | Instruction-tuned Llama model, 8B parameters, optimized for conversational tasks |
| Qwen2-0.5B-Instruct | Small instruction-tuned model, 0.5B parameters, lightweight and task-efficient |
| Qwen2-VL-7B-Instruct | Multimodal instruction-tuned model, 7B parameters, efficient vision-language understanding |
| Qwen2-VL-72B | Multimodal base model, 72B parameters, handles both vision and language |
| Qwen2-VL-72B-Instruct | Multimodal instruction-tuned model, 72B parameters, vision-language understanding and generation |
| Qwen2.5-0.5B-Instruct | Updated instruction-tuned model, 0.5B parameters, improved efficiency and task handling |
| Qwen2.5-14B-Instruct | Instruction-tuned language model, 14B parameters, balanced power and efficiency |
| Qwen2.5-32B-Instruct | Instruction-tuned language model, 32B parameters, strong at understanding tasks |
| Qwen2.5-VL-72B-Instruct | Multimodal instruction-tuned model, 72B parameters, excels at vision-language tasks |
| Mixtral-8x7B-v0.1 | Sparse Mixture-of-Experts model, 8 experts, high efficiency, strong performance |
| Mixtral-8x22B-v0.1 | Large Mixture-of-Experts model, 8×22B experts, scalable, efficient, powerful reasoning |
| Mixtral-8x22B-Instruct-v0.1 | Instruction-tuned MoE model, 8×22B experts, excels at following tasks |
| DeepSeek-R1 | Reasoning-focused language model by DeepSeek, versatile, powerful, and open-source |
| DeepSeek-R1-Distill-Llama-70B | Efficient Llama-based model, 70B parameters, distilled from DeepSeek-R1 |
| DeepSeek-R1-V3-0324 | Advanced multilingual model, latest DeepSeek version, optimized for diverse tasks |

Note: To upload your own models, please contact us.