
What Are AI Agents? Examples, How They Work, and How to Use Them

14:07 22/07/2025
AI Agents are artificial intelligence systems that can interact with their environment and make decisions to achieve goals in the real world without human guidance or intervention. This technology is shaping current technology trends, with notable milestones such as Google unveiling Project Astra at Google I/O 2024 and the emergence of GPT-4o. Large corporations are pouring billions of dollars into AI Agents to take the lead in the AI era. In this article, FPT.Cloud will clarify how AI Agents are helping businesses improve processes, enhance customer experience, and optimize operations.

1. What are AI Agents (Intelligent Agents)?

AI Agents are artificial intelligence systems that can interact with the environment and make decisions in the real world without human guidance or intervention. AI Agents can gather information from their surroundings, design their own workflows, use available tools, coordinate between different systems, and even work with other agents to achieve goals without requiring user supervision or a continuous stream of new instructions.

With the development of generative AI, natural language processing, foundation models, and large language models (LLMs), AI Agents can now process multiple types of multimodal information simultaneously, such as text, voice, video, audio, and code. Advanced agentic AI can learn and update its behavior over time, continuously experimenting with new solutions to problems until it achieves optimal results. Notably, such agents can detect their own errors and find ways to correct them as they progress.

AI Agents can exist in the physical world (robots, autonomous drones, or self-driving cars) or operate within computers and software to complete digital tasks. The components and interfaces of each agent can vary depending on its specific purpose. Encouragingly, even people without deep technical backgrounds can now build and use AI Agents through user-friendly platforms.

2. What are the key features of an AI Agent platform?
Key features of an AI Agent platform include:

- Autonomy: AI Agents can operate independently, make decisions, and take actions without continuous human supervision. For example, a self-driving car can adjust speed, change lanes, stop, or reroute based on real-time sensor data about road conditions and obstacles, without driver intervention.
- Reasoning Ability: AI Agents use logic and analyze available information to draw conclusions and solve problems. They can identify patterns in data, evaluate evidence, and make decisions based on the current context, similar to human thinking processes.
- Continuous Learning: AI Agents continuously improve their performance over time by learning from data and adapting to changes in the environment. For instance, a customer support chatbot can analyze millions of conversations to gain a deeper understanding of common issues and improve the quality of the solutions it proposes.
- Environmental Observation: AI Agents continuously collect and process information from their surroundings through techniques like computer vision, natural language processing, and sensor data analysis. This ability helps them understand the current context and make appropriate decisions.
- Action Capability: AI Agents can perform specific actions to achieve goals. These actions can be physical (like a robot moving objects) or digital (like sending emails, updating data, or triggering automated processes).
- Strategic Planning: AI Agents can develop detailed plans to achieve goals, including identifying necessary steps, evaluating alternatives, and selecting optimal solutions. This ability requires predicting future outcomes and considering potential obstacles.
- Proactivity and Reactivity: AI Agents proactively anticipate and prepare for future changes. For example, the Nest Thermostat learns the homeowner's heating habits and proactively adjusts the temperature before the user returns home, while quickly responding to unusual temperature fluctuations.
- Collaboration Ability: AI Agents can work effectively with humans and other agents to achieve common goals. This collaboration requires clear communication, coordinated actions, and an understanding of the roles and objectives of other participants in the system.
- Self-Improvement: Advanced AI Agents can self-evaluate and improve their operational performance. They analyze the results of previous actions, adjust strategies based on feedback, and continuously enhance their capabilities through machine learning and optimization techniques.

Key Features of AI Agents

3. Differences between Agentic AI Chatbots and AI Chatbots

Below is a comparison highlighting the distinctions between Agentic AI chatbots and traditional AI chatbots, criterion by criterion:

- Autonomy: Agentic chatbots operate independently and perform complex tasks without continuous intervention; traditional chatbots require continuous guidance from users and only respond when prompted.
- Memory: Agentic chatbots maintain long-term memory between sessions, remembering user interactions and preferences; traditional chatbots have limited or no memory, so each session typically starts from scratch.
- Tool Integration: Agentic chatbots use function calls to connect with APIs, databases, and external applications; traditional chatbots operate in closed environments with no access to external tools or data sources.
- Task Processing: Agentic chatbots break complex tasks into subtasks and execute them sequentially to achieve goals; traditional chatbots only process simple, individual requests and cannot decompose complex problems.
- Knowledge Sources: Agentic chatbots combine existing knowledge with new information from external sources (RAG); traditional chatbots rely solely on pre-trained data and cannot take in new information.
- Learning Capability: Agentic chatbots continuously learn from interactions, improving accuracy and relevance over time; traditional chatbots do not learn from user interactions, so responses always follow fixed patterns.
- Operation Mode: Agentic chatbots can perform multiple processing rounds for a single request, creating multi-step workflows; traditional chatbots operate on a single-turn basis (receive-process-respond), without multi-step capabilities.
- Planning Ability: Agentic chatbots plan strategically and self-adjust when encountering new information or obstacles; traditional chatbots have no long-term planning capability or strategy adjustment.
- Personalization: Agentic chatbots provide personalized experiences based on user history, preferences, and context; traditional chatbots deliver generalized responses, identical for all users.
- Response Process: Agentic chatbots analyze intent, access relevant information, create a plan, execute actions, and evaluate results; traditional chatbots recognize patterns, look up an appropriate response in an existing database, and reply.
- Error Handling: Agentic chatbots recognize errors, self-correct, and find alternative solutions when problems arise; traditional chatbots often fail to recognize errors or cannot recover from off-script situations.
- User Interaction: Agentic chatbots proactively ask clarifying questions, suggest options, and track progress; traditional chatbots are passive and only respond to what users explicitly ask.
- Workflow: Agentic chatbots use threads to store all information, connect with tools, and execute function calls when needed; traditional chatbots follow predefined scripts, with no workflow extension capability.
- Practical Applications: Agentic chatbots handle complex customer support, data analysis, process automation, and personal assistance; traditional chatbots are used primarily for FAQs, basic customer support, and simple conversations.
- Intent Detection: Agentic chatbots accurately identify users' underlying intents, even when not explicitly stated; traditional chatbots only react to specific keywords or patterns, often missing true intentions.
- System Integration: Agentic chatbots integrate easily with multiple systems and applications through APIs; traditional chatbots have limited integration capabilities, often requiring custom solutions.
- Development Requirements: Agentic chatbots can be built on no-code platforms without in-depth programming knowledge; traditional chatbots typically require programming knowledge to build and maintain.

Agentic AI chatbots mark a significant evolution in conversational AI, powered by LLMs but extending well beyond them. Operating on a thread-based architecture, they store complete conversation histories, files, and function call results.
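As a rough illustration of this thread-based storage (the class and field names below are hypothetical, not FPT.AI's actual schema), a thread that accumulates messages, files, and function-call results across turns might look like:

```python
from dataclasses import dataclass, field

@dataclass
class Thread:
    """Accumulates everything an agentic chatbot keeps across turns."""
    messages: list = field(default_factory=list)       # full conversation history
    files: list = field(default_factory=list)          # attached documents
    tool_results: list = field(default_factory=list)   # outputs of function calls

    def add_turn(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def record_tool_result(self, tool: str, result: str) -> None:
        # Stored results become context for later turns, unlike a
        # single-turn chatbot that forgets them immediately.
        self.tool_results.append({"tool": tool, "result": result})

thread = Thread()
thread.add_turn("user", "Summarize last quarter's sales")
thread.record_tool_result("sales_db_query", "Q2 revenue: $1.2M")
print(len(thread.messages), len(thread.tool_results))  # 1 1
```

Because the whole thread persists, a later turn can reference both the earlier message and the tool output without the user repeating anything.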
These advanced chatbots are activated by various triggers (scheduled events, database changes, or manual inputs) to analyze requests, interpret intentions, and execute actions autonomously. Five key innovations drive this technology:

- RAG integration for context-aware responses with higher accuracy
- Function calling to interact with external systems
- Advanced memory systems for continuous learning and adaptation
- Tool evaluation to assess resources and fill information gaps
- Subtask generation to break down complex goals independently

Unlike traditional chatbots' single-turn model (receive-process-respond), agentic chatbots process multiple turns per prompt, queue actions strategically, and dynamically select appropriate tools based on user intent. They can search connected knowledge bases, call external APIs, or generate responses from their core training when external tools aren't needed. Critically, no-code platforms have democratized their development, accelerating adoption across industries by enabling businesses of all sizes to implement sophisticated AI without significant technical investment.

Differences between Agentic AI chatbots and AI Chatbots

4. Key Components of AI Agents

AI Agents are composed of multiple components working together as a unified system, much as the human body functions with senses, muscles, and a brain. Each component in an AI Agent architecture plays a specific role in helping the agent sense, think, and interact with the surrounding world.

Key components of AI Agents

4.1. Sensors

Sensors help AI Agents collect information (percepts) from the surrounding environment to understand the context and current situation. In physical robots, sensors might be cameras for "seeing," microphones for "hearing," or thermal sensors for "feeling" temperature. For software agents running on computers, sensors might be web search functions that gather online information, or file-reading tools that process data from PDF documents, CSV files, or other formats.
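For a software agent, a "sensor" can be as simple as a function that turns raw input into percepts. A minimal sketch (the CSV layout and function name are invented for illustration):

```python
import csv
import io

def csv_sensor(stream) -> list[dict]:
    """Read a CSV source and emit one percept (dict) per row."""
    return list(csv.DictReader(stream))

# Simulated environment: a small CSV file the agent "observes".
raw = io.StringIO("city,temp_c\nHanoi,31\nDanang,29\n")
percepts = csv_sensor(raw)
print(percepts[0])  # {'city': 'Hanoi', 'temp_c': '31'}
```

A real agent would pair such sensors with many others (web search, API polling) and pass the percepts on to its decision-making components.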
Sensors help AI Agents collect information (percepts) from the surrounding environment

4.2. Actuators

If sensors are how agents receive information, actuators are how they affect the world. Actuators are the components that allow agents to perform specific actions after making decisions. In physical robots, actuators might be wheels for movement, mechanical arms for lifting objects, or speakers for producing sound. For software agents, actuators might be the ability to create new files, send emails, control other applications, or modify data in systems.

Actuators are components that allow agents to perform specific actions after making decisions

4.3. Brain

Processors, control systems, and decision-making mechanisms form the "brain" of an AI Agent, where information is processed and decisions are made. Processors analyze raw data from sensors and convert it into meaningful information. Control systems coordinate the agent's activities, ensuring all parts work harmoniously. Decision-making mechanisms are the most important part: here the agent "thinks" about the processed information, evaluates different action options, and selects the optimal action based on its goals and existing knowledge.

Processors, Control Systems, and Decision-Making Mechanisms form the "brain" of the AI Agent

4.4. Learning and Knowledge Base Systems

These are the memory and learning capabilities of AI Agents, allowing them to improve performance over time. Knowledge base systems store information the agent already knows: data about the world, rules of action, and experiences from previous interactions. This might be a database of locations, events, or problems the agent has encountered along with the corresponding solutions. Learning systems allow the agent to learn from experience, recognize patterns, and improve its decision-making. An agent with learning capabilities will continuously update its knowledge base, helping it cope better with new situations or changes in the environment.
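The component groups above can be wired together in a toy agent. A sketch under simplified assumptions (a thermostat-style rule stands in for the "brain", and a list of past readings stands in for the knowledge base; all names are illustrative):

```python
class ThermostatAgent:
    """Toy agent: sensor reading -> decision -> actuator command."""

    def __init__(self, target_c: float):
        self.target_c = target_c
        self.knowledge = []  # knowledge base: remembered readings

    def sense(self, environment: dict) -> float:
        return environment["temp_c"]          # sensor: read a percept

    def decide(self, temp: float) -> str:     # decision-making "brain"
        self.knowledge.append(temp)           # learning system: store experience
        return "heat_on" if temp < self.target_c else "heat_off"

    def act(self, command: str) -> str:       # actuator: affect the world
        return f"actuator -> {command}"

agent = ThermostatAgent(target_c=21.0)
action = agent.act(agent.decide(agent.sense({"temp_c": 18.5})))
print(action)  # actuator -> heat_on
```

The same sense/decide/act split scales up: a self-driving car swaps in camera percepts, a learned policy, and steering commands, but the component boundaries stay the same.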
The complexity of these components depends on the tasks the AI Agent performs. A smart thermostat might only need simple temperature sensors, a basic control system, and actuators to turn the heating system on and off. In contrast, a self-driving car needs all components at a high level of complexity: diverse sensors to observe roads and other vehicles, powerful processors to handle large amounts of real-time data, sophisticated decision-making systems for safe navigation, precise actuators to control the vehicle, and continuous learning systems that improve its driving with each experience.

AI Knowledge Management Agents

5. How do AI Agents Work?

When receiving a command (goal) from a user (the prompt), an AI Agent immediately initiates the goal analysis process, passing the prompt to the core AI model (typically a large language model) and beginning to plan actions. The agent breaks complex goals down into specific tasks and subtasks, with clear priorities and dependencies. For simple tasks, the agent may skip the planning stage and instead refine its responses through an iterative process.

During implementation, thanks to its sensors, the AI Agent collects information (transaction data, customer interaction history) from various sources, including external datasets, web searches, APIs, and even other agents. Throughout this collection process, the AI Agent continuously updates its knowledge base, self-adjusts, and corrects errors where necessary.

The agent's processors use algorithms, deep neural networks, machine learning models, and other AI techniques to analyze the information and work out the necessary actions. Throughout this process, the agent's memory continuously stores information, such as the history of decisions made or rules learned.
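The plan-then-iterate flow described here can be sketched as a loop over prioritized subtasks. In this sketch the plan and the executor are invented placeholders, not a real planner; a production agent would generate the subtasks with its core LLM and execute them with tools:

```python
from collections import deque

def run_agent(goal: str, plan: list[str], execute) -> dict:
    """Work through subtasks in priority order, recording results in memory."""
    memory = {"goal": goal, "results": []}
    queue = deque(plan)               # subtasks, dependencies already ordered
    while queue:
        subtask = queue.popleft()
        result = execute(subtask)     # sensors, processors, and tools act here
        memory["results"].append((subtask, result))  # memory keeps the history
    return memory

# Placeholder executor: a real agent would call tools, APIs, or other agents.
memory = run_agent(
    goal="summarize weekly sales",
    plan=["fetch data", "clean data", "write summary"],
    execute=lambda task: f"done: {task}",
)
print(len(memory["results"]))  # 3
```

Feedback (from users, other agents, or HITL review) would feed back into `memory` between iterations, which is what lets the agent avoid repeating earlier mistakes.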
Additionally, AI Agents use feedback from users, feedback from other agents, and human-in-the-loop (HITL) review to compare, adjust, and improve their performance over time, avoiding repetition of the same errors. Finally, through actuators, AI Agents perform actions based on their decisions. For robots, actuators might be the parts that help them move or manipulate objects. For software agents, this might mean sending information or executing commands on systems.

Technically, an AI agent system consists of four main components, simulating the way humans operate

To illustrate this process, imagine a user planning their vacation. They ask an AI Agent to predict which week of the coming year will have the best weather for surfing in Greece. Since the large language model that underpins the agent is not specialized in weather forecasting, the agent must access an external database containing daily weather reports for Greece over the past several years. Even with historical data, the agent cannot yet determine the optimal weather conditions for surfing. It therefore communicates with a surf agent and learns that ideal surfing conditions include high tides, sunny weather, and low or no rainfall. With the newly gathered information, the agent combines and analyzes the data to identify relevant weather patterns. Based on this, it predicts which week of the coming year in Greece is most likely to have high tides, sunny weather, and low rainfall, and presents the final result to the user.

According to BCG analysis, AI agents are rapidly penetrating many business processes, with a compound annual growth rate of up to 45% over the next 5 years

6. Common Types of AI Agents

There are 5 primary types of AI Agents: Simple Reflex Agents, Model-Based Reflex Agents, Goal-Based Agents, Utility-Based Agents, and Learning Agents.
Each is suited to specific tasks and applications:

- Simple Reflex Agents: Simple reflex agents operate on the "condition-action" principle and respond to their environment based on simple pre-programmed rules, such as a thermostat that turns on the heating system at exactly 8pm every night. Such an agent retains no memory, does not interact with other agents, and cannot react appropriately when faced with unexpected situations.
- Model-Based Reflex Agents: Model-based reflex agents use their cognitive abilities and memory to build an internal model of the world around them. By storing information in memory, these agents can operate effectively in changing environments but are still constrained by pre-programmed rules. For example, a robot vacuum cleaner can sense obstacles while cleaning a room and adjust its path to avoid collisions. It also remembers the areas it has already cleaned to avoid unnecessary repetition.
- Goal-Based Agents: Goal-based agents are driven by one or more specific goals. They look for appropriate courses of action to achieve the goal and plan ahead before executing them. For example, when a navigation system suggests the fastest route to your destination, it analyzes different paths to find the optimal one. If the system detects a faster route, it updates its suggestion with the alternative.
- Utility-Based Agents: Utility-based agents evaluate the outcomes of decisions in situations with multiple viable paths. They employ utility functions to measure the usefulness each action might bring. Evaluation criteria typically include progress toward goals, time requirements, or implementation complexity. This evaluation system helps identify the ideal choice: is the best option the cheapest? The fastest? The most efficient? For example, a navigation system may weigh fuel economy, travel time, and toll costs to select and recommend the most favorable route for the user.
- Learning Agents: Learning agents learn through percepts and sensors, using feedback from the environment or users to improve performance over time. New experiences are automatically added to the learning agent's initial knowledge base, helping it operate effectively in unfamiliar environments. For example, e-commerce websites use learning agents to track user activity and preferences, then recommend suitable products and services. The learning cycle repeats each time new recommendations are made, and user activity is continuously stored for learning purposes, helping the agent improve the accuracy of its suggestions over time.

Popular Types of AI Agents

7. What are the outstanding benefits of using AI Agents?

AI Agents for businesses deliver a consistent experience to customers across multiple channels, with the following 4 outstanding benefits:

- Improve productivity: AI Agents automate repetitive and time-intensive tasks, freeing human resources from manual work so that businesses can focus on more strategic, creative, and high-value initiatives, fostering innovation. For more complex issues, AI Agents can intelligently escalate cases to human agents. This seamless collaboration ensures smooth operations, even during periods of high demand.
- Reduce costs: By optimizing processes and minimizing human error, AI Agents help businesses cut operating costs. Complex tasks are handled efficiently without the need for constant human intervention.
- Make informed decisions: AI Agents use machine learning (ML) technologies to help managers collect and analyze data (such as product demand or market trends) in real time, enabling faster and more accurate decisions.
- Improve customer experience: AI Agents significantly enhance customer satisfaction and loyalty by offering round-the-clock support and personalized interactions. Their prompt and precise responses effectively address customer needs, ensuring a smooth and engaging service experience.
Lenovo leveraged AI agents to streamline product configuration and customer service, integrating them into key systems like inventory tracking. By building a knowledge base from purchase data, product details, and customer profiles, AI agents helped Lenovo cut setup time from 12 minutes to 2 minutes, boosting sales productivity and customer experience. This led to a 12% improvement in order delivery KPIs (within 17 days) and generated $5.88 million in one year, according to Gartner.

Benefits of implementing AI Agents in Business

8. Is ChatGPT an AI Agent?

ChatGPT is not an AI Agent. It is a large language model (LLM) designed to generate human-like responses based on received input, with some components similar to those of AI Agents:

- Simple sensors that receive text input
- Actuators that generate text, images, or audio
- A control system based on the transformer architecture
- A knowledge base built from pre-training data and fine-tuning

However, these elements are not sufficient to make ChatGPT a genuine agent. The most important difference between AI Agents and ChatGPT is autonomy. ChatGPT cannot set its own goals, make plans, or take independent actions. When you ask ChatGPT to write an email, it can create the content but cannot send the email itself, nor evaluate whether sending an email is the best action in a specific situation. Additionally, ChatGPT cannot directly interact with external systems or adjust its behavior based on real-time feedback. Plugins, extended frameworks, APIs, and prompt engineering can improve ChatGPT's functionality, but they still don't create a complete agent. ChatGPT also lacks the ability to maintain long-term memory between sessions. It doesn't "remember" you or previous conversations unless specifically programmed to do so in certain applications.

ChatGPT lacks core features to be considered an AI Agent

9. Practical Applications of AI Agents

Imagine a future workplace where every employee, manager, and leader not only works together but is also equipped with a team of AI teammates to support them in every task and at every moment of the workday. With these AI teammates, we will become 10x more productive, achieve better results, create higher-quality products, and, of course, become 10x more creative. You may be wondering, "When will this future come?" The answer from FPT is: the future is now. Here are four stories that demonstrate how AI is already impacting businesses.

9.1. Revolutionizing Insurance Claims Processing

Imagine you go to the hospital for a health check-up, buy medicine, and file an insurance claim. Typically, the insurance company's document processing takes at least 20 minutes. With integrated AI Agents, insurers can run all documents through rapid assessment, risk assessment, and fraud detection tools, returning results in just 2 minutes. This represents an incredible leap in productivity, improving the customer experience and creating new competitive value for the business.

AI Agents in Finance – Accounting

9.2. Transforming the Customer Contact Center

The second story focuses on customer service. Several FPT.AI customers have deployed AI systems for inbound and outbound communications. These systems provide human-like customer support: handling requests, resolving issues, and providing excellent service. For some customers, AI Agents now handle 70% of customer requests, complete 95% of received tasks, and achieve a customer satisfaction rating of 4.5/5. Currently, FPT's customer service AI Agents manage 200 million user interactions per month.

How AI Agents Improve Customer Service
Advantages of applying AI Agents in customer service

9.3. Empowering pharmacists with AI Mentor

At Long Chau, the largest pharmacy chain in Vietnam, more than 14,000 pharmacists work every day to advise customers.
To ensure they stay up to date and work effectively, FPT.AI has developed an AI Mentor that interacts with more than 16,000 pharmacists across 2,000 pharmacies every day. The AI Mentor identifies strengths and weaknesses, provides insights, and personalizes conversations to help them improve. The results:

- Pharmacists' competencies improved by 15%.
- Productivity increased by 30%.
- Within the first nine months of the year, the pharmacy chain recorded revenue growth of 62%, reaching VND 18.006 trillion, accounting for 62% of FRT's total revenue and completing 85% of its 2024 plan.

More importantly, we pride ourselves on helping pharmacists become the best versions of themselves while continuously improving.

FPT AI Mentor won the "Outstanding Artificial Intelligence Solution" award at AI Awards 2024

9.4. From a cost center to a profit center

FPT.AI's AI Innovation Lab works with customers to identify opportunities, deploy pilots, and scale solutions. For example, one of our clients transformed their customer service center from a cost center into a profit center. Using AI, they detected when customers were happy and immediately suggested appropriate products or services: upselling credit cards, cross-selling overdrafts, activating new customer sign-ups, and reactivating existing customers. This approach helped the customer service center contribute about 6% of total revenue.

The four stories above are just a small part of the countless ways AI can transform businesses. AI, as a new competitive factor, is opening up a blue ocean of innovation. Every company and organization will need to reinvent its operations and build a strong foundation, leveraging the advances of AI, to compete in the future.

Applications of AI Agents in practice

10. Challenges in Deploying AI Agents

AI Agents are still in the early stages of development and face many major challenges.
According to Kanjun Qiu, CEO and founder of the AI research startup Imbue, the development of AI Agents today is comparable to the race to develop self-driving cars 10 years ago: although AI Agents can perform many tasks, they are still not reliable enough and cannot operate completely autonomously. One of the biggest problems AI Agents face is the limitation of logical reasoning. According to Qiu, although AI programming tools can generate code, they often produce incorrect code or cannot test the code they write, which requires constant human intervention to perfect the process. Dr. Fan likewise commented that, at present, we have not achieved an AI Agent that can fully automate daily repetitive tasks; such systems can still "go crazy" and do not always follow the user's exact request.

Challenges and Considerations When Using AI Agents

Another major limitation is the context window: the amount of data an AI model can read, understand, and process at once. Dr. Fan explains that models like ChatGPT can write code but have difficulty processing long and complex code, while humans can easily follow hundreds of lines of code without difficulty. Companies like Google have had to improve the context-handling ability of their AI models, as with the Gemini model, to improve performance and accuracy.

For "physical" AI Agents such as robots or virtual characters in games, training them to perform human-like tasks is also a challenge. Currently, training data for these systems is very limited, and research is only beginning to explore how to apply generative AI to automation.

11. Continue writing the future with AI Agents alongside FPT.AI

In the digital economy, competition between companies and countries is no longer based solely on core resources, technology, and expertise. From now on, organizations will need to compete on a new and important factor: AI companions, or AI Agents.
It is expected that by the end of 2025, there will be about 100,000 AI Agents accompanying businesses in customer care, operations, and production. Each AI Agent will take on a number of tasks such as programming, training, customer care… As a result, employees are more empowered, and businesses increase operational productivity, improve customer experience, and make more accurate, data-driven decisions.

The Future of AI Agents

FPT AI Agents is a platform that allows businesses to develop, build, and operate AI Agents in the simplest, most convenient, and fastest way. Its main advantages include:

- Easy to operate, using natural language.
- Flexible integration with enterprise knowledge sources.
- AI models optimized for each task and language. Currently, FPT AI Agents supports 4 languages: English, Vietnamese, Japanese, and Indonesian.
- The ability to self-learn and improve over time.

FPT AI Agents is FPT Smart Cloud's trump card in the AI era

All AI Agents are operated on FPT AI Factory, an ecosystem established with the mission of empowering every organization and individual to build their own AI solutions: using their data, supplementing their knowledge, and adapting to their culture. This differentiation fosters a completely new competitive edge among enterprises and extends to building AI sovereignty among nations.

FPT AI Agents Deployment Process

With more than 80 cloud services and 20 AI products, FPT AI Factory helps accelerate AI applications by 9 times thanks to the latest-generation GPUs, such as the H100 and H200, while saving up to 45% in costs. These factories are fully compatible with the NVIDIA AI Enterprise platform and architectural blueprints, ensuring seamless integration and operation.

12. FAQs about AI Agents

12.1. What's the difference between LLMs and AI Agents?

An LLM (large language model) is an AI model trained on a vast amount of data to recognize and generate natural language.
It functions as a "language brain," predicting each next word in a sentence. However, traditional LLMs are limited to their initial training data, lack the ability to interact with the outside world, and cannot update themselves with new information after training. An AI Agent, on the other hand, is a much more complete system that typically uses an LLM as its core intelligence but is supplemented with sensors (gathering information), actuators (performing actions), knowledge bases (storing knowledge), and control systems (making decisions). This structure allows AI Agents not only to understand language but also to interact with the surrounding environment.

The decisive difference between LLMs and agents lies in the AI Agent's "tool calling" capability. Through this mechanism, an agent can retrieve updated information from external sources, optimize workflows, automatically break down and solve subtasks, store interactions in long-term memory, and plan future actions. These capabilities help AI Agents provide more personalized experiences and comprehensive responses, while expanding their practical application across many fields.

Difference among LLM, RAG and AI Agent

12.2. Are reasoning models (like OpenAI o3 and DeepSeek R1) AI agents?

No. Reasoning models like o3 and R1 are LLMs trained to reason through solutions to complex problems by breaking them down into multiple steps using chain of thought. These LLMs cannot naturally interact with other systems or extend their reasoning beyond their architectural limitations.

12.3. How do AI Agents integrate with existing systems and workflows?

The most common ways to integrate AI agents are:

- Connecting the agent to a RAG platform, which provides native connections between LLMs and knowledge bases. This allows the agent to use representations of your documents and business data as context for future responses, increasing output accuracy.
- Through APIs to external services. When you configure function calls in the AI agent platform, the model interacts with API endpoints in the same way a traditional program would, creating all the headers and body of the call.

AI Agents workflow

12.4. How does Human-in-the-loop fit into the AI Agent workflow?

Human-in-the-loop frameworks add supervision to AI agent systems. Simply put, the agent's actions pause at predetermined points in the workflow. A notification is sent to a user, who reviews the decisions, information, and scheduled tasks involved. Based on this review, the user approves or changes how the AI agent will continue the task.

12.5. Will AI agents take our jobs?

This technology will certainly replace some jobs and bring changes to the market, although there is no clear picture yet of when and how this might happen. Workers may be replaced by AI agents in many industries. At the same time, many positions in AI development and maintenance may be created, along with human-in-the-loop positions, to ensure that human decisions control AI actions rather than the other way around.

12.6. Do AI agents exacerbate bias and discrimination?

An AI model is only as unbiased as the data it was trained on, so yes, they can be biased. Addressing these issues involves changing machine learning processes and creating datasets that represent the full spectrum of the world and human experiences.

AI Agents with Human-in-the-loop

12.7. Who is responsible when an AI Agent makes a mistake?

This is a difficult problem in ethics and law; it is still unclear who should be blamed for accidents and unintended consequences. The developers? The hardware or software owners? The operators? As new laws are created and industry guardrails are implemented, we will come to understand which roles AI agents can, and cannot, assume.

In short, with their autonomy, their ability to operate independently, and their data-driven decision-making in real-world environments, AI Agents are a powerful automation solution that helps businesses optimize processes.
The AI Agents market is forecast to reach 30 billion USD by 2033, maintaining a growth rate of about 31% per year, and the technology's potential over the next decade is enormous. Contact FPT.AI now to harness the power of AI colleagues, accelerate innovation, enhance customer experience, and scale more efficiently than ever!

Integrating FPT AI Marketplace API Key into Cursor IDE for Accelerated Code Generation

11:38 18/07/2025
In the AI era, leveraging large language models (LLMs) to enhance programming productivity is becoming increasingly common. Instead of relying on expensive international services, developers in Vietnam now have access to FPT AI Marketplace, a domestic AI inference platform offering competitive pricing, high stability, and superior data locality. This article provides a step-by-step guide to integrating FPT AI Marketplace's model API into Cursor IDE, enabling you to use powerful code generation models directly within your development environment.

1. Create an FPT AI Marketplace Account

Visit https://marketplace.fptcloud.com/ and register for an account. Special offer: new users receive $1 in free credits to try the AI Inference services on the platform!

2. Browse the List of Available Models

After logging in, you can view the available models on FPT AI Marketplace.

Figure 1: List of available models on FPT AI Marketplace

For optimal code generation results, it is recommended to select models such as Qwen-32B Coder, LLaMA-8B, or DeepSeek.

3. Generate an API Key

Log in and navigate to https://marketplace.fptcloud.com/en/my-account#my-api-key. Click "Create new API Key", select the desired models, enter a name for your API key, and then click "Create".

Figure 2: API Key creation interface

Verify the information and retrieve your newly generated API Key.

Figure 3: API Key successfully created

4. Configure Cursor IDE with FPT AI Marketplace API

1. Open Cursor IDE, go to Cursor Settings, and select Models.
2. Add a model: click Add model and enter the model name (e.g., qwen_coder, deepseek_r1).
3. Enter the API key: in the OpenAI API Key field, paste the API key you generated from FPT AI Marketplace.
4. Configure the FPT AI URL: enable Override OpenAI Base URL and enter the following URL: https://mkp-api.fptcloud.com

Figure 4: Configuring API Key and Base URL in Cursor IDE

5. Confirm: click the Verify button. If Verified Successfully appears, you are ready to start using the model!

5. Using Code Generation Models in Cursor

You can now:

- Use the AI Assistant directly within the IDE to generate code.
- Ask the AI to refactor, optimize, or explain your existing code.
- Select the model you wish to use.

Figure 5: Using the Llama-3.3-70B-Instruction model from FPT AI Marketplace to refactor code

6. Monitor Token Usage

To manage your usage and costs, go to My Usage on FPT AI Marketplace to view the number of requests, input/output tokens, and total usage. This lets you see how many tokens you have used, helping you better control and manage your costs.

Conclusion

With just a few simple steps, you can harness the full power of the FPT AI Marketplace: leverage advanced AI models at a cost-effective rate; accelerate your workflow with fast code generation, intelligent code reviews, performance optimization, and automated debugging; and monitor your usage with clarity and transparency.
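Outside of Cursor, the same endpoint can be reached from any OpenAI-compatible client. The sketch below only builds the request rather than sending it; the `/v1/chat/completions` path follows the standard OpenAI convention that Cursor's base-URL override relies on, and the model name and API key are placeholders, not values from the guide:

```python
import json

BASE_URL = "https://mkp-api.fptcloud.com"   # base URL from the guide above
API_KEY = "YOUR_FPT_API_KEY"                # placeholder: paste your own key here

def build_chat_request(model: str, prompt: str):
    """Build an OpenAI-compatible chat-completions request (not sent here)."""
    url = f"{BASE_URL}/v1/chat/completions"  # standard OpenAI-style path (assumed)
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,                      # a model enabled for your API key
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, headers, body

url, headers, body = build_chat_request("qwen_coder", "Refactor this loop")
```

From here, any HTTP client (or the official OpenAI SDK with `base_url` overridden) can send the request with your real key.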

FPT AI Factory – $200 Million Bold Bet on Sovereign AI

15:36 16/07/2025
From infrastructure, core products to local talent, FPT AI Factory is laying the first bricks toward Vietnam’s long-term ambition of achieving artificial intelligence (AI) sovereignty. In April 2024, a simple phrase lit up under the white lights of FPT’s main auditorium in Hanoi: “AI Factory – Make-in-Vietnam”, next to the Nvidia logo. The message was subtle, yet it carried the pride and great aspirations of Vietnamese engineers. In an era where data is the “new oil” and algorithms dictate competitive power, a Vietnamese corporation investing in AI with infrastructure located in Vietnam, developed by Vietnamese engineers, and focused on meeting the needs of Vietnamese users is no longer just a company’s internal strategy. It is a concrete indication that Vietnam is stepping up to assert its position on the global AI map, not just as a consumer of imported technology, but as a creator of its own. Decision No. 1131/QĐ-TTg is a strategic push that places AI among the 11 national key technologies. For FPT, it’s a much-needed boost for a dream that has been nurtured for years: mastering core technologies. FPT AI Factory - Vietnam’s first AI factory - is turning that vision into reality: a place where data, algorithms, talent, and ambition come together, gradually transforming Vietnamese intelligence from an idea into strategic national products. With AI Factory, FPT did not take the safe pathway, but chose AI as a “strategic bet” - investing directly in core capabilities, building compute infrastructure, and developing an open ecosystem not just to serve itself, but to empower the entire business community. From 5-person startups to enterprises with thousands of employees, anyone can access the platform and create their own AI solutions. 
With the $200 million investment plan announced in April 2024, FPT is not building another traditional data center, but a factory of intelligence - one that doesn't manufacture hardware or commercial software, but generates the true assets of the future digital economy: computing power, language models, and intelligent agents (AI Agents). FPT's "Build Your Own AI" strategy is more than just a tech slogan or a product roadmap. It is a strategic move to narrow the gap with global digital powerhouses. Rather than replicating generic AI models, FPT is building a platform that empowers local organizations to develop AI tailored to their language, user behavior, and local business systems. This is not just a technological leap, but a strategic choice with multiple layers of meaning. It is all about making AI accessible - not only for engineers, but for those who "may not live by algorithms but live with systems," from logistics managers to customer service operators. FPT AI Factory is more than a tech initiative. It embodies a concept that is gaining global relevance: sovereign AI, where computing power is no longer dependent on external forces, data is not silently exported, and creative potential is no longer trapped in pre-packaged, imported "black boxes" with limited adaptability.

AI Factory - The Infrastructure for the Era of Mass AI

In 2023, an engineer named Pieter Levels built an AI startup from a coffee shop in Bali. This one-man operation went on to generate an annual revenue of $1 million. Another anecdote that caught widespread attention tells of a programmer who used GPT-4 to create an entire interactive adventure game within a few hours… while waiting for his wife to give birth in the hospital. With the launch of ChatGPT, AI has broken free from the inertia of linear growth into an era of exponential breakthroughs.
In just five years, AI has evolved from a specialized concept into a foundational tool: ChatGPT writes emails, Midjourney creates visuals that even professional artists admire, and Copilot has become a companion for developers who half-jokingly call it "organized laziness." AI-embedded applications have been steadily making their way into every enterprise software suite, collaborative platform, and creative tool, enabling people to make faster, more accurate decisions and complete their work more efficiently. The presence of AI does not come with a bang; it spreads quietly through millions of small clicks each day - when someone edits a document, schedules a meeting, or analyzes a customer report, AI is silently running in the background. What once required a team of engineers now fits neatly inside a browser. This shift is more than a sign of a tech trend - it's a signal that AI has become part of the digital infrastructure, weaving its way into every corner of daily life.

In this context, no-code/low-code AI platforms, which allow non-programmers to build intelligent solutions, are reshaping the full picture of digital transformation. According to Mordor Intelligence, this market is expected to reach $8.89 billion by 2030, with a compound annual growth rate (CAGR) of 17.03%. At the same time, the concept of AI Agents - autonomous agents capable of independently executing tasks within an organization - is gaining just as much traction, with a projected market value of $52.62 billion by 2030, according to MarketsandMarkets. The rise of these platforms means that AI is no longer a resource reserved for researchers or Big Tech; it is now a tool that can be deployed at any scale. With just a few clicks, a non-expert can create a customer service chatbot, a data classification system, or even a personal digital assistant - tasks that once required an entire team of engineers.

Yet this democratization brings new demands for infrastructure.
Large language models like DeepSeek-R1, with 671 billion parameters, would require immense computational resources to run effectively if their architecture were not optimized. By using a Mixture of Experts (MoE) architecture, which activates only a portion of the model during each inference, DeepSeek-R1 reduced computing costs by up to 94% while maintaining high accuracy. This is where the AI Factory model becomes essential: a production line for AI capabilities, where every step - data collection, training, fine-tuning, deployment, and operation - is optimized like an industrial process. The key differentiation lies not in hardware, but in the guiding philosophy of developing localized AI models that understand not only the language but also the behaviors, customs, and organizational dynamics of each market.

In Vietnam, those models can comprehend Vietnamese, including its unique syntax and contextual nuances, to effectively support public administration, customer service, or local financial operations. But this approach extends beyond the borders of the S-shaped country. In Japan, FPT is partnering with Sumitomo and SBI Holdings to build a second AI factory, aiming to support and collaborate with local players in developing "sovereign AI" capabilities customized to the social fabric and strict standards of the Japanese market. Each time FPT enters a new country or market, it does not apply a one-size-fits-all model. Instead, the corporation proactively localizes everything from data to product design. With a presence in over 30 countries and territories, FPT has accumulated a wealth of local insights, allowing its AI solutions to truly fit the needs of local users - exactly in line with the "Build Your Own AI" strategy. Globally, this shift is already well underway. For instance, the Japanese government has invested $740 million to build a domestic AI factory in collaboration with Nvidia, striving to ensure infrastructure independence (Nvidia Blog).
Meanwhile, the European Union plans to fund 20 large-scale AI factories between 2025 and 2026 (Digital Strategy EU). The telecom giant SoftBank has also committed over $960 million to developing its own domestic AI infrastructure. Amid this global momentum, Vietnam needs an approach suited to its own circumstances. We may not be able to compete in scale with the U.S. or China, but we can differentiate ourselves by building infrastructure that is lean, agile, and attuned to local users. FPT has made the first bold move by investing in core capabilities, developing platforms for Vietnamese businesses, and contributing to a future where AI is not an option, but a prerequisite for economic growth and national competitiveness.

From Strategic Vision to a Sovereign AI Ecosystem

Beyond demonstrating its internal capabilities, FPT's partnership with Nvidia signals a forward-looking commitment to a direction that many countries have yet to fully pursue. Departing from the conventional path, FPT AI Factory is designed to produce AI capabilities end to end, from data collection and model training to inference and deployment. In just one year, this approach has been brought to life through high-performance GPU infrastructure, no-code/low-code platforms, and an open ecosystem that enables businesses to access AI as a service - no hardware investment, no dedicated technical team - yet still capable of delivering intelligent tools that meet real-world needs. This is not AI for show, but a capability deployable at scale, aligned with the "make-in-Vietnam" spirit, spanning computing power, language models, and localized solutions. An HR specialist can build a chatbot to answer questions about company policies, a logistics manager can train a model to analyze inventory risks, and a customer service representative can use AI to categorize and respond to emails.
They do not need to know how to write a loss function or select an optimizer - the kinds of technical barriers that have long kept most people from accessing AI; they need only practical problems and real data. One of the defining features of FPT's AI platform is its deep adaptability to local contexts, including language, user behavior, and operational structures. Models are trained on curated datasets specific to each target market, enabling them to accurately understand linguistic nuances, workflows, and industry characteristics. Instead of requiring businesses to overhaul their core systems, AI services can be flexibly integrated with existing infrastructure - from CRM and ERP to internal platforms - allowing AI to operate as a natural extension of an organization's current technology ecosystem. This strategy helps safeguard data, shorten deployment time, and maintain on-site technological control: core elements for building sovereign AI capabilities that transcend the boundaries of a single nation. With an open architecture and a localization-first mindset, this model can be quickly replicated across international markets with similar needs.

The capabilities of FPT AI Factory have also been recognized through global benchmarks. On the TOP500 list of the world's most powerful supercomputers (LINPACK benchmark), FPT's systems in Vietnam and Japan currently rank 38th and 36th, respectively - a solid foundation for entering performance-intensive markets. The stature of an AI factory is also reflected in its ability to attract global partners. A standout example is LandingAI, a company founded by Andrew Ng and well known for its enterprise Visual AI platform. As it expanded into new markets, LandingAI chose to deploy its image inference workloads on the Metal Cloud from FPT AI Factory, significantly lowering costs and reducing model deployment times from several weeks to just a few days - a critical advantage for businesses pursuing rapid growth.
A single year may not seem long in the tech industry, but in the world of AI, where change happens exponentially, it is enough time to chart a path forward. While much of the world is still finding its footing, FPT made an early bet on practical AI infrastructure, not just to keep pace with global trends, but to deliver real value quickly, certainly, and strategically: building AI in a way that fits local markets and opens the door to autonomy in what is widely seen as the "power game" of the 21st century.

"Build Your Own AI" Strategy – FPT Opens Infrastructure, Enterprises Take Control of the Digital Future

FPT's announcement that it will build 5 AI factories globally by 2030 is not a publicity stunt, but a firm commitment to go the distance in the technology race: investing in robust AI infrastructure while simultaneously developing a ready, capable AI workforce as a core competency. On the "hard" side, FPT currently operates one of the region's most powerful AI computing systems, partners with global leaders like Nvidia, and continues to expand its capabilities in both Vietnam and Japan. But as Dr. Truong Gia Binh, FPT Chairman, shared, "The world is facing a serious talent shortage, and that is where FPT holds an advantage." The "soft" component of the strategy therefore focuses on education and widespread AI literacy, preparing students, engineers, and operational experts across industries with the skills to use and integrate AI at scale. FPT sees this as a journey that fuses the computational power of machines with the thinking and adaptability of humans to craft breakthrough solutions. "As AI becomes more accessible and democratized, the greater the demand will be for AI factories. We will not stop at just two factories, because many global corporations have already approached us to collaborate," said Dr. Truong Gia Binh.
The journey starts from infrastructure and data. While the world races to train models with trillions of parameters, even a 32-billion-parameter model requires around 400 billion tokens to train. For comparison, a human processes only about 50–100 million language tokens in a lifetime. Vietnam's digital data remains fragmented and lacks transparent sharing mechanisms, which presents a significant challenge for building localized AI capabilities. The solution stems from the unique data Vietnam already possesses: healthcare, education, and insurance. Once these sectors are fully digitized, we can not only build models that reflect local realities but also proactively improve public health and optimize public services. Data sovereignty, then, is not just a slogan. It is a national strategic imperative. It requires robust digital infrastructure, transparent data-sharing mechanisms, and a long-term commitment to building AI platforms that serve the interests of the Vietnamese people.

According to Mr. Le Hong Viet, CEO of FPT Smart Cloud, FPT's decision to invest directly in Vietnam is about more than building high-performance computing infrastructure. It is driven by the need to address a key bottleneck in the domestic market and move AI out of experimental labs into practical use for every business. "AI Factory was created to make AI accessible to everyone, from large enterprises to small startups and research institutions across Vietnam," said Mr. Viet. For FPT, he added, AI is not just a tool for boosting internal productivity or delivering value to clients. It is a strategic arena for establishing Vietnam's position on the global technology map. The AI-first mindset has been embedded in FPT's DNA from the beginning and now serves as the guiding principle for its next stage of growth. Infrastructure is only the starting point: the real challenge revolves around technology transfer.
This is why FPT is pursuing a parallel investment strategy: bringing the most advanced Nvidia chips to Vietnam, developing tools that help businesses fine-tune their own models at low cost, and training a workforce for both operations and deep research. Mr. Viet further noted that fine-tuning an AI model used to cost tens of thousands of dollars; now, with FPT's services, businesses can bring that cost down to a few hundred dollars, or even adopt a "pay-as-you-go" model, paying only for what they use. This grants access to cutting-edge technologies for small businesses, universities, and research institutions - those often left behind in the global AI surge.

At the same time, FPT is not hiding its ambition to expand the AI Factory model internationally, starting with Japan, one of the world's most digitally advanced economies, yet one still lacking compatible specialized AI infrastructure. Following Japan, FPT aims to accelerate expansion into markets such as Malaysia, South Korea, and Europe. "We are entering the Japanese market to stay ahead of the wave of AI adoption. FPT will operate as a truly Japanese company to promote the development of sovereign AI in Japan," Mr. Viet said. With the involvement of partners like Sumitomo and SBI Holdings, the AI Factory in Japan will not be a replica of the Vietnamese model, but a fully localized entity, from organizational structure to integrated ecosystem. At the strategic level, Mr. Le Hong Viet put it plainly: "The strategy of promoting sovereign AI will enhance Vietnam's national competitiveness by amplifying productivity and accelerating automation." In a country like Vietnam, where data is becoming a core asset, relying on foreign platforms means losing control over the value of that data. Sovereign AI, therefore, is not a theoretical ideal; it is a form of digital self-defense.
It ensures that every piece of data is processed, stored, and operated on platforms developed by Vietnamese people, for the benefit of Vietnamese people. However, without a skilled workforce ready, even the most advanced technology will remain just infrastructure, and FPT understands this clearly. “Training talent at all levels is the most critical factor in the intelligent transformation journey,” Mr. Viet emphasized. FPT AI Factory is a bold challenge and, simultaneously, a playground to attract talent, develop a highly prepared AI workforce, and embed an AI-first mindset across the organization. From business leaders (Level 1), to employees who use AI as a tool (Level 2), and further to training for AI engineers (Level 3), FPT has set the goal of training 500,000 AI-capable professionals over the next 5 years. From infrastructure to human capital, from domestic challenges to the global landscape, FPT is not standing on the sidelines of the AI evolution. It has chosen to invest, to deploy, and to let real-world results speak louder than promises. “What sets FPT apart is that we do not build AI for show - we build it to deploy. And we use this very platform in our own operations,” Mr. Viet affirmed. At a time when any AI model can be technically replicated, the ability to make those models actually work, in context, for real users is the long-term edge FPT is working to define. One year after the launch of FPT AI Factory, FPT is laying the foundation for a future where AI is not just a business advantage but a driver of national competitiveness. The journey may be long, but through each and every step, FPT is making one thing clear: Vietnam can be technologically self-reliant and ready to seize opportunity at critical turning points.

FPT’s Dual AI Factories Named TOP500 World’s Fastest Supercomputers

18:11 24/06/2025
Two AI Factories developed by FPT Corporation have been listed in the latest edition of the global supercomputer ranking TOP500, affirming FPT's world-class capabilities in artificial intelligence (AI) and cloud computing. The TOP500 is the world's most recognized ranking of high-performance computing systems, based on the LINPACK benchmark, which measures the number of complex floating-point operations per second (FLOPS) that a system can perform. Established in 1993 by three renowned HPC researchers and published twice a year, the TOP500 List is widely recognized by governments, scientific institutions, and enterprises as the global standard not only for advanced hardware capabilities but also for excellence in system design, optimization, and the ability to power complex AI and scientific workloads at scale.

In the June 2025 edition of the TOP500, FPT's AI factories, located in Japan and Vietnam, are placed No. 36 and No. 38, respectively. This ranking positions them among the world's top supercomputing infrastructures and recognizes FPT as the No. 1 commercial AI cloud provider in Japan, offering NVIDIA H200 Tensor Core GPUs (SXM5).

Figure: FPT's AI Factories ranked 36th and 38th in the TOP500 List (Source: TOP500.org)

The Japan-based AI factory, boasting 146,304 cores, achieved a remarkable performance of 49.85 PFLOPS, while the AI Factory in Vietnam attained 46.65 PFLOPS with 142,240 cores. Both AI factories employ InfiniBand NDR400, enabling seamless scaling from a single GPU to clusters of over a hundred nodes in each region and delivering consistently high performance and low latency for large-scale AI and HPC workloads. Inclusion in the 65th edition of the prestigious TOP500 list recognizes FPT's AI factories globally for their exceptional computing power, engineering expertise, and service quality, demonstrating their readiness to meet the global demand for AI research, development, and deployment.
This achievement also positions Vietnam among the world's top 15 AI nations, alongside the United States, China, Japan, Germany, and France, and stands as a testament to FPT's ongoing efforts to enhance the country's global tech presence. FPT has announced plans to establish three more AI Factories globally within the next five years, contributing to Vietnam's ambition to lead the region in AI computing infrastructure.

"Developed under the philosophy of 'Build Your Own AI', FPT AI Factory is not merely a leap in high-performance computing infrastructure, but also a solution to a core market bottleneck: making AI more accessible and applicable across all areas of life. With FPT AI Factory, any organization, business, or individual in Vietnam, Japan, and all over the globe can develop AI tailored to their own needs, gain unique competitive advantages, and accelerate comprehensive digital transformation," said Mr. Le Hong Viet, CEO of FPT Smart Cloud, FPT Corporation.

Figure: FPT's multi-region AI factory is globally certified, demonstrating readiness to accelerate AI innovation worldwide (Source: FPT)

Launched in November 2024, FPT AI Factory has been chosen by leading tech companies, such as LandingAI, to build advanced AI solutions that deliver real-world impact. FPT Corporation (FPT) is a leading global technology and IT services provider headquartered in Vietnam, operating in three core sectors: Technology, Telecommunications, and Education. With AI as a key focus, FPT has been integrating AI across its products and solutions to drive innovation and enhance user experiences within its Made by FPT ecosystem. FPT is actively expanding its AI capabilities through investments in human resources, R&D, and partnerships with leading organizations, including NVIDIA, Mila, and AITOMATIC.
These efforts are aligned with FPT's ambitious goal to solidify its status among the world's top billion-dollar IT companies.  For more information, please visit https://fpt.com/en.

Continual Pre-training of Llama-3.2-1B with FPT AI Studio

14:35 16/06/2025
I. Introduction

Large Language Models (LLMs) have transformed artificial intelligence by enabling machines to understand and generate human-like text. These models are initially pre-trained on vast datasets to grasp general language patterns. However, as new information emerges, such as scientific discoveries or trending topics, models can become outdated. Continual pretraining addresses this by updating pretrained LLMs with new data, avoiding the need to start from scratch.

This blog post dives into continual pretraining, exploring its mechanics, challenges, and benefits. We'll also show how FPT AI Studio supports this process through a practical experiment. Because continual pretraining demands significant compute resources and streamlined workflows, having the right platform is critical. Built on the NVIDIA-powered FPT AI Factory, FPT AI Studio provides a unified platform with flexible GPU options, built-in security, and zero infrastructure setup. These capabilities make it easier and faster to run complex training workflows at scale. By the end, you'll understand why continual pre-training is essential and how FPT AI Studio can help keep LLMs adaptable and relevant.

II. Continual Pretraining in LLMs

1. What Is Pretraining for LLMs?

Pre-training is the foundation of LLMs: models are trained on massive, diverse datasets such as web texts, books, or articles. This process helps them learn language structure and semantics. By predicting the next word, models can leverage vast amounts of unlabeled data. The result is a versatile model ready for tasks like chatbots or content generation.

2. Pretraining Challenges

- Computational resources: training requires thousands of GPUs, consuming significant energy and funds.
- Data quality: datasets must be diverse and unbiased to avoid skewed outputs, which can raise ethical concerns.
- Scalability: managing large datasets and models is complex, demanding efficient systems.
- Obsolescence: pretrained models can quickly become outdated as new knowledge emerges.

3. From Pretraining to Continual Pretraining

Traditional pretraining is a one-time effort, but the world doesn't stand still. New trends, research, and language patterns emerge constantly. Continual pretraining updates models incrementally, allowing them to adapt to new domains or information without losing existing knowledge. This approach saves resources compared to full retraining and keeps models relevant in dynamic fields like medicine or technology.

4. What Is Continual Pretraining?

Continual pretraining involves further training a pretrained LLM on new or domain-specific data to enhance its knowledge. Unlike fine-tuning, which targets specific tasks, continual pretraining broadens general capabilities. It uses incremental learning to integrate new data while preserving prior knowledge, often through techniques that balance retention and adaptation. For example, a model might be updated with recent news or scientific papers to stay current.

5. Continual Pretraining Challenges

- Catastrophic forgetting: new training can overwrite old knowledge, reducing performance on previous tasks.
- Data selection: choosing high-quality, relevant data is critical to avoid noise or bias.
- Model stability: models must remain robust, necessitating careful monitoring.

6. Use Cases

Continual pretraining shines in various scenarios:

- Domain adaptation: continual pretraining allows models to be further trained on domain-specific corpora, such as clinical notes, legal contracts, or financial reports, enhancing their ability to understand and generate more accurate, relevant, and trustworthy content in those areas.
- Knowledge updates: language models trained on static datasets can quickly become outdated as new events unfold, technologies emerge, or scientific discoveries are made.
Continual pretraining enables periodic or real-time integration of up-to-date information, keeping the model aligned with the latest developments. This is especially useful for any task where current knowledge is essential.
- Multilingual enhancement: many language models initially support only a limited set of widely spoken languages. Continual pretraining provides a pathway to extend these models with low-resource languages, regional dialects, or domain-specific jargon within a language. This ensures broader accessibility and inclusiveness, making the technology usable by a more diverse global population.

7. Why Not Just Fine-Tune?

Fine-tuning focuses on instruction-tuning the model across a series of downstream tasks that may differ in data distribution or change over time. It typically uses labeled datasets, such as question-answer pairs, to guide the model toward performing specific, well-defined tasks more effectively. Fine-tuning adapts models for specific tasks but has limitations:

- Task specificity: it may not generalize to broad knowledge updates.
- Overfitting risk: models can overfit to small datasets, losing versatility.

III. Continual Pretraining on FPT AI Studio

With growing interest in Vietnamese LLMs, we conducted a real-world continual pretraining experiment using FPT AI Studio, a powerful no-code platform developed by FPT Smart Cloud. FPT AI Studio provides a streamlined platform for managing and executing LLM training workflows, including the continual pretraining of LLMs. Its advantages include:

- A user-friendly graphical interface for pipeline creation and management.
- Integrated data management through Data Hub, allowing easy connection to S3 buckets for uploading large datasets.
- Simplified configuration of computing resources and hyperparameters.
- Built-in handling, via Model Fine-tuning, of difficult issues that commonly arise during LLM training.
- Safe storage of trained models in Model Hub.
Clear tracking and monitoring of training jobs.

In this blog, we continue the training of meta-llama/Llama-3.2-1B with the aim of enhancing its performance in the Vietnamese language. The continual pretraining is carried out on FPT AI Studio.

1. Prepare the dataset
We continue pretraining on a Vietnamese dataset to enhance the language capabilities of the LLM. The Vietnamese datasets used include:
bkai-foundation-models/BKAINewsCorpus - 5.2GB
vietgpt/wikipedia_vi - 5.6GB
Uonlp/CulturaX (Vietnamese subset) - 6GB
Ontocord/CulturaY (Vietnamese subset) - 2.4GB
10,000 Vietnamese Books - 1.7GB
This brings the total dataset size to 20.9GB, with each sample saved in .txt format. We allocate 0.1% of the data as the evaluation set and the remaining ~99.9% as the training set (~2.8 billion tokens). Both the training and evaluation sets are saved as .jsonl files, following FPT AI Studio's LLM training format. To use them for training, upload the .jsonl files to your S3 bucket and connect it to Data Hub.

Figure 1. Example of the .jsonl file format required by FPT AI Studio's LLM training.

To connect your S3 bucket to FPT AI Studio Data Hub:
Step 1: Go to the Data Hub tab.
Step 2: Click Create Connection.
Step 3: Fill in the S3 configuration details.
Step 4: Click Save.

Figure 2. Create Connection dialog in Data Hub.

2.
Start the training
We use 8 x GPU NVIDIA H100 SXM5 (128 CPU - 1536GB RAM - 8 x H100) and the data prepared above for continual pretraining, with the following hyperparameters:

[code lang="js"]
{
  "batch_size": 8,
  "checkpoint_steps": 1000,
  "checkpoint_strategy": "epoch",
  "disable_gradient_checkpointing": false,
  "distributed_backend": "ddp",
  "dpo_label_smoothing": 0,
  "epochs": 2,
  "eval_steps": 1000,
  "eval_strategy": "epoch",
  "finetuning_type": "full",
  "flash_attention_v2": false,
  "full_determinism": false,
  "gradient_accumulation_steps": 16,
  "learning_rate": 0.00004,
  "logging_steps": 10,
  "lora_alpha": 32,
  "lora_dropout": 0.05,
  "lora_rank": 16,
  "lr_scheduler_type": "linear",
  "lr_warmup_steps": 0,
  "max_grad_norm": 1,
  "max_sequence_length": 2048,
  "mixed_precision": "bf16",
  "number_of_checkpoints": 1,
  "optimizer": "adamw",
  "pref_beta": 0.1,
  "pref_ftx": 0,
  "pref_loss": "sigmoid",
  "quantization_bit": "none",
  "save_best_checkpoint": false,
  "seed": 1309,
  "simpo_gamma": 0.5,
  "target_modules": "all-linear",
  "weight_decay": 0,
  "zero_stage": 1
}
[/code]

Setting up the training pipeline in FPT AI Studio:
Step 1: Create Pipeline: In FPT AI Studio, navigate to Pipeline Management and click Create Pipeline.
Figure 3. Pipeline Management interface in Model Fine-tuning.
Step 2: Choose Template: Select the Blank template and click Let’s Start.
Figure 4. The "Choose Template" dialog within FPT AI Studio's pipeline creation process.
Step 3: Configure Base Model & Data: Fill in the information about the base model and dataset, then click Next Step.
Figure 5. "Base model & Data" step of "Create Pipeline".
Step 4: Configure Training Configuration: Select Pre-training from the Built-in Trainer dropdown, toggle Advanced, and paste the provided JSON. For Infrastructure, choose Single-node with 8 x GPU NVIDIA H100 SXM5 (128 CPU - 1536GB RAM). Click Next Step.
Figure 6. "Training Configuration" step of "Create Pipeline".
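As a quick sanity check on this configuration, the effective batch size and approximate number of optimizer steps can be derived from the hyperparameters above, together with the ~2.8-billion-token training set described in the dataset section. This arithmetic is illustrative and is not produced by FPT AI Studio itself:

```python
# Rough training-scale estimate from the hyperparameter JSON above.
# Assumes the ~2.8B-token training set described earlier in this post.
batch_size = 8                  # per-GPU micro-batch ("batch_size")
grad_accum = 16                 # "gradient_accumulation_steps"
num_gpus = 8                    # single node with 8 x H100
seq_len = 2048                  # "max_sequence_length"
epochs = 2                      # "epochs"
train_tokens = 2_800_000_000    # approximate training-set size in tokens

tokens_per_step = batch_size * grad_accum * num_gpus * seq_len
steps_per_epoch = train_tokens // tokens_per_step
total_steps = steps_per_epoch * epochs

print(f"Effective batch: {batch_size * grad_accum * num_gpus} sequences "
      f"({tokens_per_step:,} tokens) per optimizer step")
print(f"~{steps_per_epoch:,} steps/epoch, ~{total_steps:,} steps total")
```

So each optimizer step consumes 1,024 sequences (about 2.1M tokens), giving on the order of 1,300 steps per epoch and roughly 2,700 steps for the full two-epoch run.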
Step 5: Configure Others: Since this is pretraining, no test data is required. Check the Send Email option to receive notifications upon completion, then click Next Step.
Figure 7. "Others" step of "Create Pipeline".
Step 6: Review and Submit: Enter a name and a brief description for the pipeline run, carefully review all configured settings, and click Submit.
Figure 8. "Review" step of "Create Pipeline".
Step 7: Start training: Click Start on the training pipeline to begin the training process.
Figure 9. Start the training pipeline in FPT AI Studio.

During training, you can track metrics such as loss, eval loss, and learning rate in the Model Metrics tab.
Figure 10. Training metrics.

After training, the continually pretrained model is saved under Model Hub → Private Model. We can download this model for personal use.
Figure 11. The continually pretrained model in Model Hub.

3. Results
After training, the loss decreased from 2.8746 to 1.9966, and the evaluation loss dropped to 2.2282, indicating that the model adapted effectively to the Vietnamese language.
Figure 12. Loss and Eval Loss metrics during the continual pretraining process.

We evaluated the continually pretrained model on the Vietnamese benchmark ViLLM-Eval using the EleutherAI Evaluation Harness, and compared the results against the base model. Across all tasks, the metrics showed consistent improvements, some of them substantial. For instance, on the lambada_vi task, accuracy increased from 0.2397 to 0.3478, an improvement of nearly 11 percentage points.

Model | comprehension_vi | exams_vi | lambada_vi | wikipediaqa_vi
Baseline | 0.6156 | 0.2912 | 0.2397 | 0.3210
Continued Pretraining | 0.6178 | 0.3318 | 0.3478 | 0.3970

Table 1. Performance (accuracy) comparison of the baseline Llama-3.2-1B model and the continually pretrained model on Vietnamese benchmark tasks from ViLLM-Eval.

In addition, we analyzed results on various subsets of exams_vi, covering subjects such as math, physics, biology, and literature in Vietnamese.
The continually pretrained model demonstrated clear improvements over the baseline in every subject area.

Model | exams_vi_dia | exams_vi_hoa | exams_vi_su | exams_vi_sinh | exams_vi_toan | exams_vi_van | exams_vi_vatly
Baseline | 0.3235 | 0.2522 | 0.2897 | 0.2819 | 0.2572 | 0.3192 | 0.2976
Continued Pretraining | 0.3791 | 0.2609 | 0.3563 | 0.3113 | 0.2653 | 0.3662 | 0.3000

Table 2. Detailed performance (accuracy) comparison on various subject-area subsets of exams_vi between the baseline and the continually pretrained model.

These improvements demonstrate the feasibility of building high-performing Vietnamese LLMs with minimal overhead, opening the door for domain-specific applications in fintech, edtech, and more.

IV. Conclusion
As language and knowledge evolve at breakneck speed, Large Language Models must keep up or risk becoming obsolete. Continual pretraining emerges as a vital solution, enabling models to integrate new data seamlessly while preserving previously learned knowledge. Unlike traditional pretraining or task-specific fine-tuning, this approach offers a scalable path to sustained performance across dynamic domains like healthcare, finance, and education, and especially for low-resource languages like Vietnamese.

Our experiment using FPT AI Studio demonstrated that continual pretraining is not only feasible but highly effective. By training Llama-3.2-1B on curated Vietnamese datasets, we achieved substantial performance gains across multiple benchmarks, proving that with the right tools, high-quality Vietnamese LLMs are within reach.

What sets FPT AI Studio apart is the seamless, end-to-end experience. From integrating datasets with Data Hub to orchestrating powerful GPUs and managing pipelines efficiently, FPT AI Studio removes complexity and helps your team focus on what matters most: improving your models and delivering impact faster.
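For readers who want to quantify the gains summarized above, the per-task improvements from Table 1 can be tallied directly. The scores below are transcribed from that table; the script itself is illustrative and not part of the FPT AI Studio workflow:

```python
# Accuracy scores transcribed from Table 1 (ViLLM-Eval).
baseline = {"comprehension_vi": 0.6156, "exams_vi": 0.2912,
            "lambada_vi": 0.2397, "wikipediaqa_vi": 0.3210}
continued = {"comprehension_vi": 0.6178, "exams_vi": 0.3318,
             "lambada_vi": 0.3478, "wikipediaqa_vi": 0.3970}

# Absolute improvement per task, in accuracy points.
deltas = {task: round(continued[task] - baseline[task], 4) for task in baseline}
for task, delta in sorted(deltas.items(), key=lambda kv: -kv[1]):
    print(f"{task:18s} +{delta:.4f} ({delta * 100:.2f} points)")
```

The largest gain lands on lambada_vi (+0.1081, the "nearly 11 percentage points" cited above), followed by wikipediaqa_vi and exams_vi.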
Whether you're developing a domain-specific chatbot, enhancing multilingual capabilities, or putting LLMs into production, FPT AI Studio provides the tools, infrastructure, and flexibility to help you build your own AI with confidence.   

Run Jupyter Notebook on GPU Container with a Few Clicks

11:55 16/06/2025
1. Introduction to GPU Container
Overview of GPU Container
GPU Container is a preconfigured, enterprise-grade solution designed to accelerate machine learning and data science workflows. It enables users to deploy GPU-accelerated containers that integrate seamlessly with Jupyter Notebook, providing a robust environment for developing and training machine learning models. By leveraging NVIDIA GPUs and containerization technology, this feature delivers high-performance computing with both flexibility and scalability.

Key Benefits of the GPU Container Platform
High-Performance Computing: Utilize 1 to 8 NVIDIA GPUs to drastically reduce training times for complex models.
Environment Consistency: Preconfigured containers ensure reproducible setups across diverse projects, eliminating configuration conflicts.
Interactive Workflow: Jupyter Notebook provides a sophisticated interface for coding, visualization, and documentation.
Resource Optimization: Allocate GPU resources precisely, enhancing cost efficiency for scalable deployments.
Framework Versatility: Support for built-in images and custom images for specialized needs.
This guide introduces GPU Container and provides a comprehensive, step-by-step walkthrough for setting it up and using it to develop machine learning models with Jupyter Notebook.

2. Spinning up your GPU Container
Prerequisites: an AI Factory account with at least $5 of credit, so you can experiment with GPU Container for an hour or two without interruption.
GPU Container is now available on FPT AI Factory; follow the steps below to spin up your GPU Container.
Visit ai.fptcloud.com and switch to the GPU Container dashboard, where you can see details about your running, paused, and stopped containers, all in one place.

Step 1: Select a GPU Flavor
GPU Container allows you to choose a flavor based on your computational needs. You can select from 1 to 8 NVIDIA H100 GPUs to match your workload requirements.
For example, choose 1 GPU for lightweight tasks or 8 GPUs for large-scale model training.
In the AI Factory interface, navigate to the Create New Container section.
Select the desired GPU configuration (1–8 GPUs) from the configuration panel.

Step 2: Choose a Container Image
The platform offers multiple built-in images optimized for machine learning, as well as support for custom images. To select a built-in image, navigate to the Template panel. Alternatively, if your project requires specific libraries or configurations, specify a custom Docker image by selecting the Custom Template button and filling in the URL of your custom image.

Step 3: Configure Environment Variables and Startup Commands
Customize the container’s behavior by setting environment variables and startup commands, if necessary:
Environment Variables: Specify variables such as USERNAME and PASSWORD for Jupyter access, or framework-specific settings.
Startup Command: Override the default command if needed (e.g., jupyter notebook --ip=0.0.0.0 --port=8888 for Jupyter). For this guide, we use the default settings of the built-in Jupyter Notebook image, which starts Jupyter Notebook automatically.

Step 4: Launch and Wait for the Container
Ensure that the configuration and pricing for your GPU Container instance meet your requirements. If everything looks good, select Create Container to spin up your container.
Wait for the container to be ready. The platform will provision the container, allocate the selected GPUs, and start the Jupyter Notebook server. You can track your container’s status on the GPU Container dashboard. Alternatively, click your container’s name on the dashboard for more information about your setup. Once ready, the platform provides a direct link to the Jupyter interface. Click the provided endpoint to access Jupyter Notebook.

3.
Training YOLOv11 with Jupyter Notebook in the GPU Container
This section demonstrates how to upload an existing Jupyter Notebook to the GPU Container and train a YOLOv11 model for brain tumor detection. We have prepared a notebook to demonstrate the machine learning and deep learning capabilities of GPU Container. First, verify the current setup:
One NVIDIA H100 GPU is available in the container.
The Jupyter Notebook interface is accessible and ready for interaction.
Proceed to install the packages required for model training. After the installation completes, initiate the training process by executing the following code snippet:

[code lang="python"]
from ultralytics import YOLO

# Load a pretrained model (recommended for training)
model = YOLO("yolo11m.pt")

# Train the model
results = model.train(data="brain-tumor.yaml", epochs=10, imgsz=640, workers=0)
[/code]

Model training is now in progress using the specified parameters. After training completes, validate the model's performance by running inference on a sample input. The model completed training successfully without error.

Export the trained model for deployment or future inference tasks. Execute the following command to export the model in ONNX format:

[code lang="python"]
from ultralytics import YOLO

# Load the custom trained model
model = YOLO("runs/detect/train/weights/best.pt")

# Export the model to ONNX
model.export(format="onnx")
[/code]

Once the model is exported, it can be downloaded to a local environment for further use.

4. Conclusion
Summary of the Workflow
This guide has demonstrated how to:
Spin up a GPU Container on FPT AI Factory by selecting a GPU configuration, choosing an image, and configuring environment variables.
Upload a Jupyter Notebook to the container and execute a machine learning model.
Persist models and notebooks for ongoing use.
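As a practical addendum to the workflow above: when reusing or sharing a container, a quick check that the allocated GPUs are actually visible can save a failed training run. A minimal, dependency-free sketch, assuming `nvidia-smi` is present on the image's PATH (as it is in standard NVIDIA GPU images):

```python
import shutil
import subprocess

def list_gpus():
    """Return the GPU names reported by nvidia-smi, or an empty list."""
    if shutil.which("nvidia-smi") is None:
        return []  # nvidia-smi not on PATH: no NVIDIA driver visible
    result = subprocess.run(["nvidia-smi", "-L"],
                            capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return [line for line in result.stdout.splitlines() if line.strip()]

gpus = list_gpus()
print(f"{len(gpus)} GPU(s) visible" if gpus else "No GPU visible")
for gpu in gpus:
    print(" ", gpu)
```

Running this in the first cell of a notebook confirms the container sees the flavor you selected before any heavyweight training starts.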
GPU Container delivers a high-performance, scalable solution for machine learning, combining GPU acceleration with the flexibility of containerization and Jupyter Notebook for enterprise-grade applications.

Recommended Next Steps
Maximize your experience with GPU Container:
Integrate with Visual Studio Code: Use the Remote - Containers extension for a unified development environment.
Serve LLMs with vLLM: Enhance your workflow by integrating vLLM, a high-performance library for running large language models efficiently.
Explore Advanced Frameworks: Leverage the platform’s support for PyTorch and custom images to address complex machine learning challenges.
Deploy your model: After training and evaluation, your model can be served in a production environment; you can deploy it with our Model Hub service.
Leverage GPU Container to accelerate your machine learning workflows in enterprise environments.

FPT AI Factory: A Powerful AI SOLUTION Suite with NVIDIA H100 and H200 Superchips

13:37 10/06/2025
In the booming era of artificial intelligence (AI), Viet Nam is making a strong mark on the global technology map through the strategic collaboration between FPT Corporation and NVIDIA – the world’s leading provider of high-performance computing solutions, to develop FPT AI Factory, a comprehensive suite for end-to-end AI. This solution is built on the world’s most advanced AI technology, NVIDIA H100 and NVIDIA H200 superchips. Video: Mr. Truong Gia Binh (Chairman of FPT Corporation) discusses the strategic cooperation with NVIDIA in developing comprehensive AI applications for businesses According to the Government News (2024), Mr. Truong Gia Binh – Chairman of the Board and Founder of FPT Corporation – emphasized that FPT is aiming to enhance its capabilities in technology research and development, while building a comprehensive ecosystem of advanced products and services based on AI and Cloud platforms. This ecosystem encompasses everything from cutting-edge technological infrastructure and top-tier experts to deep domain knowledge in various specialized fields. "We are committed to making Vietnam a global hub for AI development." 1. Overview of the Two Superchips NVIDIA H100 & H200: A New Leap in AI Computing 1.1 Information about the NVIDIA H100 Chip (NVIDIA H100 Tensor Core GPU) The NVIDIA H100 Tensor Core GPU is a groundbreaking architecture built on the Hopper™ Architecture (NVIDIA’s next-generation GPU processor design). It is not just an ordinary graphics processing chip, but a machine specially optimized for Deep Learning and Artificial Intelligence (AI) applications. [caption id="attachment_62784" align="aligncenter" width="1200"] Chip NVIDIA H100 (GPU NVIDIA H100 Tensor Core)[/caption]   The NVIDIA H100 superchip is manufactured using TSMC's advanced N4 process and integrates up to 80 billion transistors. Its processing power comes from a maximum of 144 Streaming Multiprocessors (SMs), purpose-built to handle complex AI tasks. 
Notably, the NVIDIA Hopper H100 delivers optimal performance when deployed via the SXM5 socket. Thanks to the enhanced memory bandwidth provided by the SXM5 standard, the H100 offers significantly superior performance compared to implementations using conventional PCIe sockets—an especially critical advantage for enterprise applications that demand large-scale data handling and high-speed AI processing. [caption id="attachment_62802" align="aligncenter" width="1363"] NVIDIA H100 Tensor Core GPUs: 9x faster AI training and 30x faster AI inference compared to the previous generation A100 in large language models[/caption]   NVIDIA has developed two different form factor packaging versions of the H100 chip: the H100 SXM and H100 NVL, designed to meet the diverse needs of today’s enterprise market. The specific use cases for these two versions are as follows: H100 SXM version: Designed for specialized systems, supercomputers, or large-scale AI data centers aiming to fully harness the GPU’s potential with maximum NVLink scalability. This version is ideal for tasks such as training large AI models (LLMs, Transformers), AI-integrated High Performance Computing (HPC) applications, or exascale-level scientific, biomedical, and financial simulations. H100 NVL version: Optimized for standard servers, this version is easily integrated into existing infrastructure with lower cost and complexity compared to dedicated SXM systems. It is well-suited for enterprises deploying real-time AI inference, big data processing, Natural Language Processing (NLP), computer vision, or AI applications in hybrid cloud environments. 
Product Specifications | H100 SXM | H100 NVL
FP64 | 34 teraFLOPS | 30 teraFLOPS
FP64 Tensor Core | 67 teraFLOPS | 60 teraFLOPS
FP32 | 67 teraFLOPS | 60 teraFLOPS
TF32 Tensor Core* | 989 teraFLOPS | 835 teraFLOPS
BFLOAT16 Tensor Core* | 1,979 teraFLOPS | 1,671 teraFLOPS
FP16 Tensor Core* | 1,979 teraFLOPS | 1,671 teraFLOPS
FP8 Tensor Core* | 3,958 teraFLOPS | 3,341 teraFLOPS
INT8 Tensor Core* | 3,958 TOPS | 3,341 TOPS
GPU Memory | 80GB | 94GB
GPU Memory Bandwidth | 3.35TB/s | 3.9TB/s
Decoders | 7 NVDEC, 7 JPEG | 7 NVDEC, 7 JPEG
Max Thermal Design Power (TDP) | Up to 700W (configurable) | 350-400W (configurable)
Multi-Instance GPUs | Up to 7 MIGs @ 10GB each | Up to 7 MIGs @ 12GB each
Form Factor | SXM | PCIe dual-slot air-cooled
Interconnect | NVIDIA NVLink™: 900GB/s; PCIe Gen5: 128GB/s | NVIDIA NVLink: 600GB/s; PCIe Gen5: 128GB/s
Server Options | NVIDIA HGX H100 Partner and NVIDIA-Certified Systems™ with 4 or 8 GPUs; NVIDIA DGX H100 with 8 GPUs | Partner and NVIDIA-Certified Systems with 1–8 GPUs
NVIDIA AI Enterprise | Add-on | Included

Table 1.1: Specification table of the two H100 form factors, H100 SXM and H100 NVL

1.2 Information about the NVIDIA H200 Chip (NVIDIA H200 Tensor Core GPU)
[caption id="attachment_62803" align="aligncenter" width="1200"] Information about the NVIDIA H200 Chip (NVIDIA H200 Tensor Core GPU) including both form factors: H200 SXM and H200 NVL[/caption]
Building upon and advancing the Hopper™ architecture, the NVIDIA H200 Tensor Core GPU is a powerful upgrade of the H100, introduced by NVIDIA as the world’s most powerful AI chip at its launch in November 2023, delivering results up to twice as fast as the H100. The H200 is designed to handle even larger and more complex AI models, especially generative AI models and large language models (LLMs). As with the H100 superchip, NVIDIA offers two form factors of the H200 Tensor Core product, both designed for enterprise use: H200 SXM and H200 NVL.
NVIDIA H200 SXM: Designed to accelerate generative AI tasks and high-performance computing (HPC), especially with the capability to process massive amounts of data. This is the ideal choice for dedicated systems, supercomputers, and large AI data centers aiming to fully leverage the GPU’s potential with maximum NVLink scalability. Enterprises should use the H200 SXM for scenarios such as training extremely large AI models, HPC applications requiring large memory, and enterprise-level generative AI deployment. NVIDIA H200 NVL: Optimized to bring AI acceleration capabilities to standard enterprise servers, easily integrating into existing infrastructure. This version is particularly suitable for enterprises with space constraints needing air-cooled rack designs with flexible configurations, delivering acceleration for all AI and HPC workloads regardless of scale. Use cases for H200 NVL in enterprises include real-time AI inference, AI deployment in hybrid cloud environments, big data processing, and natural language processing (NLP). 
Product Specifications | H200 SXM | H200 NVL
FP64 | 34 TFLOPS | 30 TFLOPS
FP64 Tensor Core | 67 TFLOPS | 60 TFLOPS
FP32 | 67 TFLOPS | 60 TFLOPS
TF32 Tensor Core² | 989 TFLOPS | 835 TFLOPS
BFLOAT16 Tensor Core² | 1,979 TFLOPS | 1,671 TFLOPS
FP16 Tensor Core² | 1,979 TFLOPS | 1,671 TFLOPS
FP8 Tensor Core² | 3,958 TFLOPS | 3,341 TFLOPS
INT8 Tensor Core² | 3,958 TOPS | 3,341 TOPS
GPU Memory | 141GB | 141GB
GPU Memory Bandwidth | 4.8TB/s | 4.8TB/s
Decoders | 7 NVDEC, 7 JPEG | 7 NVDEC, 7 JPEG
Confidential Computing | Supported | Supported
Max Thermal Design Power (TDP) | Up to 700W (configurable) | Up to 600W (configurable)
Multi-Instance GPUs | Up to 7 MIGs @ 18GB each | Up to 7 MIGs @ 16.5GB each
Form Factor | SXM | PCIe dual-slot air-cooled
Interconnect | NVIDIA NVLink™: 900GB/s; PCIe Gen5: 128GB/s | 2- or 4-way NVIDIA NVLink bridge: 900GB/s per GPU; PCIe Gen5: 128GB/s
Server Options | NVIDIA HGX™ H200 partner and NVIDIA-Certified Systems™ with 4 or 8 GPUs | NVIDIA MGX™ H200 NVL partner and NVIDIA-Certified Systems with up to 8 GPUs
NVIDIA AI Enterprise | Add-on | Included

Table 1.2: Technical specifications of the two H200 form factors, H200 SXM and H200 NVL

1.3 Detailed Comparison Between NVIDIA H100 and NVIDIA H200 Superchips
[caption id="attachment_62804" align="aligncenter" width="1200"] The differences between the two superchips, H100 and H200, across SXM and NVL form factors, especially for building AI infrastructure and applications for enterprises[/caption]
Based on the information regarding the two NVIDIA products, H100 (H100 SXM - H100 NVL) and H200 (H200 SXM - H200 NVL), provided by FPT Cloud, here is a detailed comparison table between the NVIDIA H100 and H200 for your reference:

Features | NVIDIA H100 (SXM) | NVIDIA H100 (NVL) | NVIDIA H200 (SXM) | NVIDIA H200 (NVL)
Architecture | Hopper™ | Hopper™ | Inherits and evolves from Hopper™ | Inherits and evolves from Hopper™
Manufacturing Process | TSMC N4 (80 billion transistors) | TSMC N4 (80 billion transistors) | An upgraded version of the H100 | An upgraded version of the H100
FP64 | 34 TFLOPS | 30 TFLOPS | 34 TFLOPS | 30 TFLOPS
FP64 Tensor Core | 67 TFLOPS | 60 TFLOPS | 67 TFLOPS | 60 TFLOPS
FP32 | 67 TFLOPS | 60 TFLOPS | 67 TFLOPS | 60 TFLOPS
TF32 Tensor Core | 989 TFLOPS | 835 TFLOPS | 989 TFLOPS | 835 TFLOPS
BFLOAT16 Tensor Core | 1,979 TFLOPS | 1,671 TFLOPS | 1,979 TFLOPS | 1,671 TFLOPS
FP16 Tensor Core | 1,979 TFLOPS | 1,671 TFLOPS | 1,979 TFLOPS | 1,671 TFLOPS
FP8 Tensor Core | 3,958 TFLOPS | 3,341 TFLOPS | 3,958 TFLOPS | 3,341 TFLOPS
INT8 Tensor Core | 3,958 TOPS | 3,341 TOPS | 3,958 TOPS | 3,341 TOPS
GPU Memory | 80GB | 94GB | 141GB | 141GB
GPU Memory Bandwidth | 3.35TB/s | 3.9TB/s | 4.8TB/s | 4.8TB/s
Decoders | 7 NVDEC, 7 JPEG | 7 NVDEC, 7 JPEG | 7 NVDEC, 7 JPEG | 7 NVDEC, 7 JPEG
Confidential Computing | No information available | No information available | Supported | Supported
Max Thermal Design Power (TDP) | Up to 700W (configurable) | 350-400W (configurable) | Up to 700W (configurable) | Up to 600W (configurable)
Multi-Instance GPUs | Up to 7 MIGs @ 10GB each | Up to 7 MIGs @ 12GB each | Up to 7 MIGs @ 18GB each | Up to 7 MIGs @ 16.5GB each
Form Factor | SXM | PCIe interface, dual-slot, air-cooled | SXM | PCIe interface, dual-slot, air-cooled
Interconnect | NVIDIA NVLink™: 900GB/s; PCIe Gen5: 128GB/s | NVIDIA NVLink: 600GB/s; PCIe Gen5: 128GB/s | NVIDIA NVLink™: 900GB/s; PCIe Gen5: 128GB/s | 2- or 4-way NVIDIA NVLink bridge: 900GB/s per GPU; PCIe Gen5: 128GB/s
Server Options | NVIDIA HGX H100 Partner and NVIDIA-Certified Systems™ with 4 or 8 GPUs; NVIDIA DGX H100 with 8 GPUs | Partner and NVIDIA-Certified Systems with 1–8 GPUs | NVIDIA HGX™ H200 Partner Systems and NVIDIA-Certified Platforms with 4 or 8 GPUs | NVIDIA MGX™ H200 NVL Partner and NVIDIA-Certified Systems (up to 8 GPUs)
NVIDIA AI Enterprise | Add-on | Included | Add-on | Included

Table 1.3: Detailed comparison table
between NVIDIA H100 (SXM - NVL) and NVIDIA H200 (SXM - NVL) 2. FPT strategically partners with NVIDIA to develop the first AI Factory in Vietnam The strategic synergy between NVIDIA, a leading technology company, and FPT's extensive experience in deploying enterprise solutions has forged a powerful alliance in developing pioneering AI products for the Vietnamese market. NVIDIA not only supplies its cutting-edge NVIDIA H100 and H200 GPU superchips but also shares profound expertise in AI architecture. For FPT Corporation, FPT Smart Cloud will be the trailblazing entity to provide cloud computing and AI services built upon the foundation of this AI factory, enabling Vietnamese enterprises, businesses, and startups to easily access and leverage the immense power of AI. [caption id="attachment_62805" align="aligncenter" width="2560"] FPT Corporation is a strategic partner of NVIDIA in building and developing the FPT AI Factory solutions: FPT AI Infrastructure, FPT AI Studio, and FPT AI Inference[/caption]   Notably, FPT will concentrate on developing Generative AI Models, offering capabilities for content creation, process automation, and solving complex problems that were previously challenging to address. In the era of burgeoning AI technologies, B2B enterprises across all sectors—from Finance, Securities, and Insurance to Manufacturing and Education are facing a pressing need for a reliable partner to achieve digital transformation breakthroughs. FPT AI Factory from FPT Cloud is the optimal solution, offering your business the following outstanding advantages: Leading AI Infrastructure: By directly utilizing the NVIDIA H100 and H200 superchips, FPT AI Factory delivers a powerful AI computing platform, ensuring superior performance and speed for all AI tasks. 
Diverse Service Ecosystem: FPT AI Factory is not just hardware but a comprehensive ecosystem designed to support businesses throughout the entire AI solution lifecycle—from development and training to deployment. Cost Optimization: Instead of investing millions of dollars in complex AI infrastructure, businesses can leverage FPT AI Factory as a cloud service, optimizing both initial investment and operational costs. Security, Compliance, and Integration: FPT is committed to providing a secure AI environment that meets international security standards while also enabling seamless integration with existing enterprise systems. [caption id="attachment_62806" align="aligncenter" width="1642"] The superior advantages of the FPT AI Factory solution for businesses across various industries in the market[/caption] 3. Building a Comprehensive FPT AI Factory Ecosystem (FPT AI Infrastructure, FPT AI Studio, and FPT AI Inference) Powered by NVIDIA H100 & H200 Superchips FPT AI Factory currently offers a trio of AI solutions developed based on the core technology of NVIDIA H100 & NVIDIA H200 superchips for enterprises, including: FPT AI Infrastructure: This is the group of products related to enterprise infrastructure. FPT AI Studio: This is the group of products related to the platform of tools and services for enterprises. FPT AI Inference: This is the group of products related to the platform for AI (Artificial Intelligence) and ML (Machine Learning) models for enterprises. Video: FPT’s trio of AI solutions — FPT AI Infrastructure, FPT AI Studio, and FPT AI Inference — enables businesses to build, train, and operate AI solutions simply, easily, and effectively. 
3.1 FPT AI Infrastructure Solution
[caption id="attachment_62807" align="aligncenter" width="1528"] The FPT AI Infrastructure solution enables businesses to deploy high-performance computing infrastructure, develop AI solutions, and easily scale according to demand[/caption]
FPT AI Infrastructure is a robust cloud computing infrastructure platform, specially optimized for AI workloads. It provides superior computing power from NVIDIA H100 and H200 GPUs, enabling enterprises to build supercomputing infrastructure, easily access and utilize resources to train AI models rapidly, and flexibly scale according to their needs using technologies such as Metal Cloud, GPU Virtual Machine, Managed GPU Cluster, and GPU Container. Register for FPT AI Infrastructure today to build and develop powerful infrastructure for your business!

3.2 The FPT AI Studio Product
[caption id="attachment_62808" align="aligncenter" width="1587"] The FPT AI Studio product helps businesses process data, develop, train, evaluate, and deploy artificial intelligence and machine learning models based on their specific needs[/caption]
Once a business has established an infrastructure system with advanced GPU technology, the next step is to build and develop its own artificial intelligence and machine learning models tailored to specific operational and application needs. FPT AI Studio is the optimal solution for this. It is a comprehensive AI development environment offering a full suite of tools and services to support businesses throughout the entire process, from data processing, model development, training, and evaluation to deployment of real-world AI/ML models, using cutting-edge technologies such as Data Hub, AI Notebook, Model Pre-training, Model Fine-tuning, and Model Hub. Register now to start building and deploying AI and Machine Learning models for your business today!
3.3 The FPT AI Inference Service [caption id="attachment_62809" align="aligncenter" width="1580"] FPT AI Inference service enhances the inference capabilities for enterprises' AI and Machine Learning models[/caption] Once an enterprise's AI or Machine Learning model has been trained using internal and other crucial data, deploying and operating it in a real-world environment demands an efficient solution. FPT AI Inference is the intelligent choice for your business. This solution is optimized to deliver high inference speed and low latency, ensuring your AI models can operate quickly and accurately in real-world applications such as virtual assistants, customer consultation services, recommendation systems, image recognition, or natural language processing, powered by advanced technologies like Model Serving and Model-as-a-Service. This is the final piece in the FPT AI Factory solution suite, helping enterprises to put AI into practical application and deliver immediate business value. Enhance the inference capabilities and real-world applications of your enterprise AI models today with FPT AI Inference! 4. Exclusive offer for customers registering to experience FPT AI Factory on FPT Cloud [caption id="attachment_62810" align="aligncenter" width="1312"] Special benefits for businesses when registering to use FPT AI Factory services as early as possible[/caption] Exclusive incentives from FPT Cloud just for you when you register early to experience the comprehensive AI Factory solution trio: FPT AI Infrastructure, FPT AI Studio, and FPT AI Inference today:  Priority access to FPT AI Infrastructure services at preferential pricing: Significantly reduce costs while accessing world-class AI infrastructure, tools, and applications—right here in Vietnam. Early access to premium features of FPT AI Factory: Ensure your business stays ahead by being among the first to adopt the latest AI technologies and tools in the digital transformation era. 
- Receive Cloud credits to explore a diverse AI & Cloud ecosystem: Experience other powerful FPT Cloud solutions that enhance operational efficiency, such as FPT Backup Services, FPT Disaster Recovery, and FPT Object Storage.
- Gain expert consultation from seasoned AI & Cloud professionals: FPT’s AI and Cloud specialists will support your business in applying and operating the FPT AI Factory solution suite effectively, driving immediate business impact.

Register now to receive in-depth consultation on the FPT AI Factory solution from FPT Cloud’s team of experienced AI & Cloud experts!

[caption id="attachment_62811" align="aligncenter" width="963"] Registration Form for Expert AI & Cloud Consultation on FPT AI Factory's Triple Solution Suite for Enterprises[/caption]

LandingAI – Agentic Vision Technologies Leader from Silicon Valley – Leverages FPT AI Factory to Accelerate Visual AI Platform

17:09 03/06/2025
LandingAI, a Silicon Valley-based leader in agentic vision technologies founded by Dr. Andrew Ng, is leveraging FPT AI Factory services to accelerate the development of its tools, including Agentic Document Extraction, Agentic Object Detection, and VisionAgent. Through this partnership, LandingAI utilizes Metal Cloud, powered by NVIDIA H100 Tensor Core GPUs, to meet the growing demand for high-performance computing, scalability, and operational efficiency.

LandingAI is redefining visual intelligence with its tools, applying an agentic AI framework designed to help users solve complex visual tasks using unstructured data such as images and documents. The system intelligently selects and orchestrates vision models and generates deployable code to automate similar tasks in the future. A key challenge in developing the Visual AI platform lies in the need for substantial computing resources to fine-tune the agents, run reinforcement learning loops, and drive continuous performance improvement, while ensuring rapid iteration speed to keep pace with innovation.

Tackling Computational Challenges with Metal Cloud

FPT AI Factory offers the critical infrastructure needed to fast-track the development of the Visual AI platform and address performance complexities. Through the partnership with FPT, LandingAI gains access to Metal Cloud, a high-performance AI infrastructure fueled by NVIDIA H100 GPUs, backed by high SLAs and continuous support from FPT’s experts.

The cutting-edge GPUs deliver the computational power necessary for supervised fine-tuning and reinforcement learning at scale, enabling rapid and efficient model development. The seamless integration and minimal setup friction further allow LandingAI to quickly incorporate the H100s into its training pipeline and iterate on model architectures and agent behaviors at unprecedented speed and efficiency.
In addition, LandingAI is able to expand its computing capacity while optimizing resource consumption thanks to the competitive pricing of FPT AI Factory services.

Key benefits achieved:

- Significant improvements in visual task generalization
- 3X faster deployment of customer-facing features

“As LandingAI expands our agentic vision technology offerings, FPT AI Factory has provided us with a solid and flexible infrastructure for our large-scale AI development and deployment,” said Mr. Dan Maloney, CEO of LandingAI. “Their system's reliability and flexibility have streamlined our Visual AI workflows, significantly reducing iteration time. We have seen improved operational stability in production and cost savings. Their responsive support has made integration seamless.”

Agentic Document Extraction Playground

A Solid Foundation for Agentic AI Innovation

FPT AI Factory is a full-stack ecosystem for end-to-end AI development, designed to make AI accessible, scalable, and tailored to each business’s unique goals. Powered by thousands of NVIDIA Hopper H100/H200 GPUs, combined with the latest NVIDIA AI Enterprise software platform, FPT AI Factory provides robust infrastructure, foundational models, and the tools businesses need to build and advance AI applications from the ground up, with faster time-to-market and enterprise-grade performance at a fraction of traditional costs.

As global demand for agentic AI systems gains momentum to transform business task automation with minimal effort, LandingAI's adoption of FPT AI Factory demonstrates the potential of high-performance, flexible AI infrastructure to drive innovation in this fast-growing domain. These agentic systems, designed to perform complex tasks using natural language prompts, are not only reshaping automation and collaboration but also making advanced AI capabilities more approachable for developers, engineers, and business users alike.
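The agentic pattern described here, in which an agent selects and orchestrates vision models to carry out a task stated in natural language, can be illustrated with a minimal, self-contained sketch. The tool names and the keyword-based planner below are invented for illustration only and do not reflect LandingAI's actual VisionAgent internals:

```python
# Minimal sketch of an agentic orchestration loop: a planner chooses which
# vision "tools" to run for a given task, then executes them in order.
# All names here are hypothetical; real systems would wrap trained models.

from typing import Callable, Dict, List

# Mock vision tools; in practice these would call real model endpoints.
def detect_objects(doc: str) -> str:
    return f"objects({doc})"

def extract_text(doc: str) -> str:
    return f"text({doc})"

TOOLS: Dict[str, Callable[[str], str]] = {
    "object_detection": detect_objects,
    "document_extraction": extract_text,
}

def plan(task: str) -> List[str]:
    """Naive keyword planner: decide which tools the task needs."""
    steps: List[str] = []
    if "detect" in task or "object" in task:
        steps.append("object_detection")
    if "document" in task or "extract" in task:
        steps.append("document_extraction")
    return steps

def run_agent(task: str, doc: str) -> List[str]:
    """Select tools for the task, run each on the input, collect results."""
    return [TOOLS[name](doc) for name in plan(task)]

print(run_agent("detect objects in image", "img.png"))  # ['objects(img.png)']
```

In a production agentic system, the planner would typically be an LLM choosing among real model endpoints rather than a keyword matcher, and the selected steps could be emitted as deployable code to automate similar tasks later, as the article describes.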
The AI Agent market is projected to reach $52.62 billion by 2030, with a CAGR of 46.3% from 2025 to 2030. Built on low-code or no-code platforms, AI Agents are fostering faster AI adoption and more dynamic human-AI collaboration across various sectors. The computing power and agility provided by FPT AI Factory emerge as critical enablers for businesses to enter and lead the next era of intelligent automation.

“FPT and LandingAI share a mutual vision to democratize AI and make its powerful capabilities accessible to all. This collaboration marks another milestone in our long-term partnership to establish a strong foundation for developing next-generation AI technologies, such as Agentic AI, driving innovation and bringing tangible value across multiple industries,” shared Mr. Le Hong Viet, CEO of FPT Smart Cloud, FPT Corporation.

Looking ahead, FPT is committed to continuously enhancing FPT AI Factory to further eliminate infrastructure barriers and simplify AI development, empowering businesses to innovate faster, smarter, and more efficiently.

About FPT Corporation

FPT Corporation (FPT) is a leading global technology and IT services provider headquartered in Vietnam, operating in three core sectors: Technology, Telecommunications, and Education. With AI as a key focus, FPT has been integrating AI across its products and solutions to drive innovation and enhance user experiences within its Made by FPT ecosystem. FPT is actively expanding its AI capabilities through investments in human resources, R&D, and partnerships with leading organizations such as NVIDIA, Mila, AITOMATIC, and LandingAI. These efforts align with FPT's ambitious goal to solidify its status among the world's top billion-dollar IT companies. For more information, please visit https://fpt.com/en.

About LandingAI

LandingAI™ delivers cutting-edge agentic vision technologies that empower customers to unlock the value of visual data.
With LandingAI’s solutions, companies realize the value of AI and move AI projects from proof of concept to production. Guided by a data-centric AI approach, LandingAI’s flagship product, LandingLens™, enables users to build, iterate, and deploy Visual AI solutions quickly and easily. LandingAI is a pioneer in agentic vision technologies, including Agentic Document Extraction and Agentic Object Detection, which enhance the ability to process and understand visual data at scale, making sophisticated Visual AI tools more accessible and efficient.

Founded by Andrew Ng, co-founder of Coursera, founding lead of Google Brain, and former chief scientist at Baidu, LandingAI is uniquely positioned to lead the development of Visual AI that benefits all. For more information, visit https://landing.ai/.