All documents

Managed – FPT Kubernetes Engine

    How to use GPU in Kubernetes
    How to use GPU in Kubernetes
    Updated on 29 Nov 2024

    FPT Cloud provides Kubernetes with NVIDIA GPUs, offering the following key features:

    • Flexible GPU configuration with multiple GPU types, optional GPU memory, applied to each Worker Group.

    • Automated GPU resource management and allocation in Kubernetes with the NVIDIA Operator.

    • Visualization and monitoring of GPUs using NVIDIA DCGM (Data Center GPU Manager).

    • Automatic scaling of Containers/Nodes with Autoscaler when applications require increased/decreased GPU resources.

    • GPU sharing support with the Multi-Instance mechanism, optimizing GPU resource usage and costs.

    FPT Cloud utilizes the NVIDIA GPU Operator, providing an automated tool for managing all the necessary software components to use GPUs on Kubernetes. The GPU Operator enables users to utilize GPU resources similar to using CPUs in a Kubernetes cluster.

    The components of the Operator include:

    • NVIDIA Drivers (CUDA, MIG, ...)

    • NVIDIA Device Plugin

    • NVIDIA Container Toolkit

    • NVIDIA GPU Feature Discovery

    • NVIDIA Data Center GPU Manager (Monitoring)

    Currently, FPT Cloud supports Kubernetes using NVIDIA A30 GPUs with the following MIG profiles:

    No.  GPU A30 Profile  Strategy  Number instance  Instance resource 
    all-1g.6gb  single  1g.6gb 
    all-2g.12gb  single  2g.12gb 
    all-balanced  mixed  1g.6gb 
      all-balanced mixed 1 2g.12gb 
    none (no label)  none  0 (Entire) 

    Example:

    If the configuration strategy single: all-1g.6gb is selected, the A30 GPU on the worker node is divided into 4 MIG devices with logical GPU resources equivalent to ¼ of the physical GPU and 6GB of GPU RAM each.

    Note:

    • MIG configuration applies to all cards attached to the worker.

    • MIG strategy across worker groups in the same cluster must be of the same type (single/mixed/none).