Select Dataset

Updated on 05 Nov 2025

Start by preparing a high-quality dataset, because dataset quality directly affects how well the model performs on your intended use case.

Use the following guidelines to improve dataset quality:

Collect examples to target remaining issues.
- If the model still isn't good at certain aspects, add training examples that directly show the model how to do these aspects correctly.
Scrutinize existing examples for issues.
- If your model has grammar, logic, or style issues, check whether your data has any of the same issues. For instance, if the model now says "I will schedule this meeting for you" (when it shouldn't), see if existing examples teach the model to say it can do new things that it can't do.
Consider the balance and diversity of data.
- If 60% of the assistant responses in the data say "I cannot answer this", but at inference time only 5% of responses should say that, you will likely get an overabundance of refusals.
Make sure your training examples contain all of the information needed for the response.
- If you want the model to compliment a user based on their personal traits, and a training example includes assistant compliments for traits not found in the preceding conversation, the model may learn to hallucinate information.
Look at the agreement and consistency in the training examples.
- If multiple people created the training data, it's likely that model performance will be limited by the level of agreement and consistency between people. For instance, in a text extraction task, if people only agreed on 70% of extracted snippets, the model would likely not be able to do better than this.
Make sure all of your training examples are in the same format, as expected for inference.
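Two of the checks above, balance and format consistency, are easy to run before uploading. The following is a minimal sketch, assuming the data is already loaded as a list of Alpaca-style dictionaries with `instruction`, `input`, and `output` fields; the helper names `refusal_share` and `key_layouts` and the refusal phrase list are hypothetical and not part of the platform.

```python
from collections import Counter

# Hypothetical phrase list; adapt it to the refusals your data actually uses.
REFUSAL_MARKERS = ("i cannot answer this",)


def refusal_share(records, markers=REFUSAL_MARKERS):
    """Return the fraction of records whose 'output' contains a refusal marker."""
    if not records:
        return 0.0
    hits = sum(
        any(marker in record.get("output", "").lower() for marker in markers)
        for record in records
    )
    return hits / len(records)


def key_layouts(records):
    """Count how many records use each distinct set of field names."""
    return Counter(tuple(sorted(record)) for record in records)


# Toy data: three of five outputs are refusals, and one record lacks 'input'.
sample = [
    {"instruction": "Summarize the text.", "input": "Some text.", "output": "A short summary."},
    {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"},
    {"instruction": "What is my password?", "input": "", "output": "I cannot answer this."},
    {"instruction": "Predict the lottery.", "input": "", "output": "I cannot answer this."},
    {"instruction": "Read my mind.", "output": "I cannot answer this."},
]

print(f"refusal share: {refusal_share(sample):.0%}")
for layout, count in key_layouts(sample).items():
    print(count, layout)
```

A refusal share far above the rate you expect at inference time, or more than one field layout, suggests the dataset needs rebalancing or reformatting before training.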

You can provide the training data and evaluation data in two ways:

Upload a file
1. Keep the default option, Upload file.
2. Choose a local file from your computer.
3. (Optional) Click Download sample to see an example of the expected format.

Notice: Ensure the file matches the selected data format.

Trainer	Supported data format	Supported file formats	Maximum file size
SFT	Alpaca	CSV, JSON, JSONLINES, ZIP, PARQUET	100 MB
SFT	ShareGPT	JSON, JSONLINES, ZIP, PARQUET	100 MB
SFT	ShareGPT_Image	ZIP, PARQUET	100 MB
DPO	ShareGPT	JSON, JSONLINES, ZIP, PARQUET	100 MB
Pre-training	Corpus	TXT, JSON, JSONLINES, ZIP, PARQUET	100 MB
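The table above can be turned into a local pre-flight check that runs before an upload. This is a minimal sketch, not a platform API: the function name `check_upload` is hypothetical, the mapping of the `.jsonl` extension to JSONLINES is an assumption, and because the page does not define the unit of the 100MB limit, the sketch assumes the stricter decimal reading.

```python
import os

# Assumption: the page states a 100MB limit without defining the unit,
# so this sketch uses the stricter decimal interpretation.
MAX_BYTES = 100 * 1000 * 1000

COMMON = {"json", "jsonlines", "zip", "parquet"}

# (trainer, data format) -> allowed file formats, copied from the table.
ALLOWED_FORMATS = {
    ("SFT", "Alpaca"): COMMON | {"csv"},
    ("SFT", "ShareGPT"): COMMON,
    ("SFT", "ShareGPT_Image"): {"zip", "parquet"},
    ("DPO", "ShareGPT"): COMMON,
    ("Pre-training", "Corpus"): COMMON | {"txt"},
}

# Assumption: files ending in .jsonl are treated as JSONLINES.
EXTENSION_ALIASES = {"jsonl": "jsonlines"}


def check_upload(trainer, data_format, filename, size_bytes):
    """Return a list of problems; an empty list means the file passes this check."""
    allowed = ALLOWED_FORMATS.get((trainer, data_format))
    if allowed is None:
        return [f"unsupported combination: {trainer} / {data_format}"]

    problems = []
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    extension = EXTENSION_ALIASES.get(extension, extension)
    if extension not in allowed:
        problems.append(f"file format '{extension}' is not listed for {trainer} / {data_format}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file size {size_bytes} bytes exceeds {MAX_BYTES} bytes")
    return problems


# CSV is listed for Alpaca but not for ShareGPT.
print(check_upload("SFT", "Alpaca", "train.csv", 5_000_000))
print(check_upload("SFT", "ShareGPT", "train.csv", 5_000_000))
```

In practice, the size argument could come from `os.path.getsize(path)`. Passing this check does not guarantee that the file contents match the selected data format, so the Download sample file remains the reference for the expected structure.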

Connect to Data Hub
1. Click Data Hub.
2. Select a connection or dataset from the Data Hub. Notice: Ensure the dataset is compatible with the selected format.
3. (Optional) Click Open Data Hub to preview or manage datasets.
4. (Optional) Click the Reload icon to refresh the connection and dataset list.
5. For details, follow the Data Hub guide.
