About Us
Highlights FPT Cloud Server FPT AI Factory FPT Network FPT Cloud Backup & DR FPT Storage FPT Security FPT Container FPT Database FPT Cloud Monitoring FPT Data Suite FPT.AI

Show all

Object Storage

Secure, unlimited storage to ensures efficiency as well as high and continuous data access demand.

GPU Server

Virtual server integration for 3D Rendering, AI or ML

FPT Load Balancing

Enhance application capacity and availability.

FPT AI Factory

Access to an all-inclusive stack for AI development, driven by NVIDIA’s powerful technology!

Cloud WAF

FPT Web Application Firewall provides powerful protection for web applications

Cloud Server

Advanced virtual server with rapid scalability

Backup Service

Backup and restore data instantly, securely and maintain data integrity.

Cloud Server

Advanced virtual server with rapid scalability

FPT AI Factory

Access to an all-inclusive stack for AI development, driven by NVIDIA’s powerful technology!

FPT Load Balancing

Enhance application capacity and availability.

Backup Service

Backup and restore data instantly, securely and maintain data integrity.

Disaster Recovery Service

Recovery, ensuring quick operation for the business after all incidents and disasters.

Block Storage

Diverse throughput and capacity to meet various business workloads.

Object Storage

Secure, unlimited storage to ensures efficiency as well as high and continuous data access demand.

Cloud WAF

FPT Web Application Firewall provides powerful protection for web applications

FPT Cloud WAPPLES

Intelligent and Comprehensive Virtual Web Application Firewall - Security Collaboration between FPT Cloud and Penta Security.

Next-Gen Firewall

The Next generation firewall security service

Container Registry

Easily store, manage, deploy, and secure Container images

Kubernetes Engine

Safe, secure, stable, high-performance Kubernetes platform

FPT Database for MongoDB

Provided as a service to deploy, monitor, backup, restore, and scale MongoDB databases on cloud.

FPT Database for Redis

Provided as a service to deploy, monitor, backup, restore, and scale Redis databases on cloud.

PostgreSQL Database Engine

Provided as a service to deploy, monitor, backup, restore, and scale PostgreSQL databases on cloud.

FPT Data Suite

Helps reduce operational costs by up to 40% compared to traditional BI solutions, while improving efficiency through optimized resource usage and infrastructure scaling.
Pricing
Partner
- Tech news
- White Paper
Event

ENG

LLaMA Factory: A Feature-Rich Toolkit for Accessible LLM Customization

Author: Vũ Tuấn Kiệt

18:21 11/04/2025

Table of Contents

As large language models (LLMs) and vision-language models (VLMs) become increasingly essential in modern AI applications, the ability to fine-tune these models on custom datasets has never been more important. However, for many developers, especially those without a deep background in machine learning, existing frameworks can be overwhelming, requiring heavy coding and complex configurations.

LLaMA Factory is an open-source toolkit designed to make LLM fine-tuning accessible to everyone. Whether you're a beginner, a non-technical professional, or an organization seeking an efficient model customization solution, LLaMA Factory simplifies the entire process with an intuitive web interface and support for dozens of fine-tuning strategies.

In this article, we’ll explore what makes LLaMA Factory stand out, who can benefit from it, and how it compares with other popular frameworks.

Who should use LLaMA Factory?

LLaMA Factory is ideal for:

🧑‍💻 Beginner developers experimenting with LLMs

📊 Data analysts and researchers without ML expertise

🧠 AI enthusiasts working on personal or community projects

🏢 Small teams or startups without ML engineering bandwidth

If you want to fine-tune powerful open-source models like LLaMA or Mistral on your own dataset without writing a line of code, this tool is built for you.

What Does LLaMA Factory Offer?

LLaMA-Factory is an open-source project that provides a comprehensive set of tools and scripts for fine-tuning, serving, and benchmarking LLM and VLM models. It provides a solution for flexibly customizing the fine-tuning of 100+ LLMs and VLMs without the need for coding through the built-in web UI LlamaBoard.

The LLaMA-Factory repository makes it easy to get started with large models by providing:

Scripts for data preprocessing and tokenization tasks

Training pipelines for fine-tuning models.

Inference scripts for generating text with trained models

Benchmarking tools to evaluate model performance

Gradio web UI for interactive testing and training.

LLaMA Factory is designed specifically for beginners and non-technical professionals who want to fine-tune open-source LLMs on their custom datasets without learning complex concepts of AI. Users simply select a model, upload their dataset, and adjust a few parameters to initiate the training process.

Once training is complete, the same web application can be used to test the model before exporting it to Hugging Face or saved locally. This provides a fast and efficient way for fine-tuning LLMs in a local environment.

ss llama
Figure: LLaMA Factory Architecture

Comparing Feature support across LLM training frameworks

Here's how LLaMA Factory stacks up against other popular LLM fine-tuning frameworks like FastChat, LitGPT, and LMFlow:

	Llama Factory	FastChat	LitGPT	LMFlow	Open-Instruct
LoRA	✓	✓	✓	✓	✓
QLoRA	✓	✓	✓	✓	✓
DoRA	✓
LoRA+	✓
PiSSA	✓
GaLore	✓	✓		✓	✓
BAdam	✓
Flash attention	✓	✓	✓	✓	✓
S2 attention	✓
Unsloth	✓		✓
DeepSpeed	✓	✓	✓	✓	✓
SFT	✓	✓	✓	✓	✓
RLHF	✓			✓
DPO	✓				✓
KTO	✓
ORPO	✓

Table: Comparison of features in LlamaFactory with popular frameworks of fine-tuning LLMs

Note: While most frameworks are built on PyTorch and have similar hardware requirements, LLaMA Factory differentiates itself through its ease of use, wide feature support, and strong community. It stands out with extensive support for multiple fine-tuning techniques, including LoRA, QLoRA, DoRA, PiSSA, and more, providing users with flexibility to optimize models based on their specific needs.

Fine-Tuning Techniques Supported

	Freeze-tuning	GaLore	LoRA	DoRA	LoRA+	PiSSA
Mixed precision	✓	✓	✓	✓	✓	✓
Checkpointing	✓	✓	✓	✓	✓	✓
Flash attention	✓	✓	✓	✓	✓	✓
S2 attention	✓	✓	✓	✓	✓	✓
Quantization	✗	✗	✓	✓	✓	✓
Unsloth	✗	✗	✓	✓	✓	✓

Table 2: Compatibility between the fine-tuning techniques featured in Llama Factory.

Quick Overview of Techniques

Freeze-tuning: involves freezing a majority of parameters while fine-tuning the remaining parameters in a small subset of decoder layers.

Gradient low-rank projection: projects gradients into a lower-dimensional space, facilitating full-parameter learning in a memory-efficient manner.

Low-rank adaptation freezes all pre-trained weights and introduces a pair of trainable low-rank matrices to the designated layer.

QLoRA: LoRA combines with quantization to reduce memory usage.

DoRA (Weight-Decomposed Low-Rank Adaptation): breaks down pre-trained weights into magnitude and direction components and updates directional components for enhanced performance.

LoRA+: is proposed to overcome the sub-optimality of LoRA.

PiSSA (Principal Singular Values and Singular Vectors Adaptation) initializes adapters with the principal components of the pre-trained weights for faster convergence.

Quick Start to LLaMA-Factory

1. Installing Dependencies

Workspace and environment can be set up easily by cloning LLaMA Factory repository.


git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git  

cd LLaMA-Factory 

conda create --name llama-factory python=3.11 

conda activate llama-factory 

pip install -e ".[torch,liger-kernel,metrics]" 

pip install deepspeed==0.14.4

2. Preparing Dataset

LLaMA Factory supports multiple data formats for various training methods (e.g., SFT, DPO, RLHF...) as well as sample datasets to help us visualize the structure of a dataset. All data is stored in the /data directory. Users can prepare a customized dataset and update its information in the dataset_info.json file which is also located in the /data directory.

For example, I have a dataset where the image paths, prompt information, and model responses are stored in the file C:/User/annotations.json, and the dataset is structured as follows:

ss 1 net

At this point, you can add the following information to the dataset_info.json file:

3. Finetuning

You can choose to fine-tune via LLaMA Factory's WebUI by running the following command in the terminal:


cd LLaMA-Factory 

GRADIO_SHARE=1 llamafactory-cli webui

The web interface will appear as follows:

thwr

You can adjust the required training configurations, specify the path to the output directory, and click 'Start' to begin the training process.

Additionally, you can fine-tune via the command line by preparing a config.yaml file that includes the required training configurations. You can find examples of different training configurations in the /examples directory.

ss chuan

Then, run the following command to start the training process:

llamafactory-cli train training_config.yaml

4. Merge LoRA

In the case of LoRA training, the adapter weights need to be merged with the original model to obtain the fine-tuned model. This process can be executed via the WebUI by selectingclicking on the 'Export' tab. For the command line, you also need to prepare a configuration file in YAML format, as shown in the following example:

ss nets 5

Then run the following command to merge:


llamafactory-cli export merge_config.yaml

5. Resource monitoring

After setting up the environment successfully, you can check the running process using the 'nvidia-smi' command. Below is an example of training a LLM model using LlamaFactory on H100 node

ss cuoi net

Conclusion

LLaMA Factory is a powerful and user-friendly framework designed to lower the barrier to entry for individuals interested in customizing large language models. It offers a comprehensive suite of state-of-the-art fine-tuning techniques, intuitive UI controls, and compatibility with popular deployment tools while eliminating the need for coding. Whether you're an ML novice or just want a faster way to experiment with LLMs, LLaMA Factory is definitely worth checking out.

Learn more about FPT AI Factory's services HERE.

For more information and consultancy about FPT AI Factory, please contact:

Hotline: 1900 638 399
Email: [email protected]
Support: m.me/fptsmartcloud

Maybe you are interested

What Are AI Agents? Examples, How they work, How to use them.

Vision-Language Models (VLM) Use Cases for Insurance Company on NVIDIA H100 GPUs

Use Cases for Training Large Language Models (LLMs) with Slurm on Metal Cloud

TOP 10 TRENDING TECHNOLOGIES IN 2021

Maybe you are interested

What Are AI Agents? Examples, How they work, How to use them.

14:07 22/07/2025

Vision-Language Models (VLM) Use Cases for Insurance Company on NVIDIA H100 GPUs

21:19 11/04/2025

Use Cases for Training Large Language Models (LLMs) with Slurm on Metal Cloud

14:38 21/04/2025

TOP 10 TRENDING TECHNOLOGIES IN 2021

09:47 05/03/2021

Cookie	Duration	Description
cookielawinfo-checbox-analytics	11 months
cookielawinfo-checbox-functional	11 months
cookielawinfo-checbox-others	11 months
cookielawinfo-checkbox-necessary	11 months
cookielawinfo-checkbox-performance	11 months
viewed_cookie_policy	11 months