Across the globe, the fields of data science and artificial intelligence (AI) are exerting a significant influence on the design and development of modern data centers. As data volumes continue to surge, traditional data centers are struggling to keep pace, resulting in declining operational efficiency. Incorporating AI into data centers can significantly enhance existing functions and processes, such as fault prediction and advanced modeling and simulation of yet-to-be-built data centers. The direct beneficiaries of this integration are data center operators, who can substantially increase work efficiency while effectively lowering operating costs.
However, to derive meaningful results from existing deep learning models, data center operators need to constantly increase computational power and memory bandwidth. Presently, powerful general-purpose chips, such as CPUs, are no longer sufficient to support such complex deep-learning models. Therefore, AI chips capable of achieving parallel computing capabilities are increasingly gaining popularity.
① Artificial Intelligence is Transforming Data Centers
For years, data center and storage providers such as Google, Amazon, and Meta have been continuously improving their operations by utilizing AI. As a result, AI has become a reasonable investment in data center construction. Let’s take a closer look at the improvements that can be achieved with AI-enhanced data centers.
⑴ Energy Efficiency
As data centers continue to grow in size, complexity, and connectivity to the cloud, AI is becoming an essential tool for preventing device overheating and saving energy. According to the U.S. Department of Energy’s “United States Data Center Energy Usage Report,” the electricity consumption of U.S. data centers has been growing at a rate of approximately 4% per year since 2010, reaching 73 billion kilowatt-hours in 2020, which is more than 1.8% of the country’s total electricity consumption.
In addition, data centers account for approximately 2% of global greenhouse gas emissions. Many data centers are therefore using AI to improve operational efficiency, especially in energy management, where AI can automatically monitor and adjust the power and cooling requirements of the entire facility. Publicly available data shows that by using AI to control the heating, ventilation, and air conditioning (HVAC) systems in its data centers, Google has cut the energy used for cooling by as much as 40%.
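To make the idea concrete, here is a minimal, purely illustrative sketch of such a control loop in Python: a toy model predicts PUE (power usage effectiveness) from a cooling setpoint, and the controller picks the most efficient setpoint within a safe envelope. The model, its coefficients, and the candidate setpoints are all invented for illustration; production systems such as Google's use deep neural networks trained on extensive sensor telemetry.

```python
# Hypothetical sketch of an AI-style cooling-control loop. The quadratic
# model and all numbers below are illustrative, not measured.

def predict_pue(setpoint_c, coeffs=(2.8, -0.14, 0.0031)):
    """Toy model: predicted PUE as a function of the chilled-water
    setpoint in degrees Celsius."""
    a, b, c = coeffs
    return a + b * setpoint_c + c * setpoint_c ** 2

def choose_setpoint(candidates, max_safe_c=27.0):
    """Pick the setpoint with the lowest predicted PUE that stays
    within the safe operating envelope."""
    safe = [s for s in candidates if s <= max_safe_c]
    return min(safe, key=predict_pue)

if __name__ == "__main__":
    candidates = [18.0, 20.0, 22.0, 24.0, 26.0, 28.0]
    best = choose_setpoint(candidates)
    print(f"setpoint={best} C, predicted PUE={predict_pue(best):.3f}")
```

A real controller would refit the model continuously from live telemetry and add safety interlocks; the point here is only the predict-then-optimize loop.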
⑵ Server Optimization
AI-based predictive analytics can help data center operators intelligently allocate workloads across the company's many servers. As a result, data center loads become predictable and easier to manage. Load-balancing tools with built-in AI capabilities can learn from historical data and distribute workloads more effectively.
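The core of such a tool can be sketched in a few lines. In this hedged example, a simple exponential moving average stands in for the learned load forecaster, and each new job is routed to the server with the lowest predicted load; the server names and utilization figures are invented for illustration.

```python
# Sketch of predictive load balancing: forecast per-server load from
# history, then route work to the least-loaded forecast. A real AI
# balancer would use a trained model instead of this simple EMA.

def ema_forecast(history, alpha=0.5):
    """Forecast next-interval load as an exponential moving average
    of past utilization samples (0.0 to 1.0)."""
    forecast = history[0]
    for sample in history[1:]:
        forecast = alpha * sample + (1 - alpha) * forecast
    return forecast

def assign_workload(server_histories):
    """Route the next job to the server with the lowest predicted load."""
    forecasts = {name: ema_forecast(h) for name, h in server_histories.items()}
    return min(forecasts, key=forecasts.get)

if __name__ == "__main__":
    histories = {
        "srv-a": [0.60, 0.70, 0.80],   # trending up
        "srv-b": [0.90, 0.50, 0.40],   # trending down
        "srv-c": [0.65, 0.65, 0.65],   # flat
    }
    print(assign_workload(histories))  # picks the downward-trending server
```

Note that a purely reactive balancer would avoid "srv-b" because of its high past peak; the forecast-based version sees the downward trend and uses it.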
⑶ Fault Prediction and Troubleshooting
AI/ML-based monitoring systems have been deployed in many data centers, with hundreds of sensors tracking the health of equipment in real time, including humidity, temperature, and operational performance. The data and conclusions such systems collect are extremely helpful for performing predictive maintenance and preventing the large-scale downtime that emergency repairs would otherwise cause.
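The simplest building block of such monitoring is outlier detection on a sensor stream. The sketch below flags temperature readings that deviate strongly from the stream's baseline using a z-score; the threshold and readings are illustrative, and production systems use learned models rather than a fixed statistical rule.

```python
# Illustrative anomaly check for a sensor stream: flag samples more
# than `threshold` standard deviations from the stream's mean.
import statistics

def anomalous_readings(samples, threshold=2.0):
    """Return indices of samples whose z-score exceeds the threshold."""
    mean = statistics.fmean(samples)
    stdev = statistics.pstdev(samples)
    if stdev == 0:
        return []  # perfectly flat stream: nothing to flag
    return [i for i, x in enumerate(samples)
            if abs(x - mean) / stdev > threshold]

if __name__ == "__main__":
    inlet_temps_c = [22.1, 22.3, 22.0, 22.4, 22.2, 31.8, 22.1, 22.3]
    print(anomalous_readings(inlet_temps_c))  # flags the 31.8 C spike
```

Flagged indices would feed a maintenance queue, so a failing fan or blocked airflow is serviced before it forces an emergency shutdown.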
⑷ Intelligent Monitoring and Storage of Data
Combined with machine learning (ML), AI can take over the routine work of monitoring large volumes of data, improving both the quality and the efficiency with which IT professionals handle their tasks. AI-driven robots have also found a remarkable application in data centers in the form of inspection robots. Such robots can replace a faulty disk without human intervention, automatically checking, locating the faulty disk, swapping it out, and returning to charge, all within four minutes.
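The check-locate-replace cycle described above can be sketched as a simple pipeline. Everything here is hypothetical: the function names, the health flags, and the fleet data are invented to illustrate the workflow, not taken from any real robot's control software.

```python
# Hypothetical sketch of the inspection-robot workflow:
# check disk health -> locate faulty disks -> replace -> report.

def locate_faulty_disks(disk_health):
    """Identify disks whose health flag reports a failure."""
    return [disk_id for disk_id, healthy in disk_health.items() if not healthy]

def run_inspection(disk_health):
    """One inspection pass: find faulty disks, 'replace' them, report."""
    faulty = locate_faulty_disks(disk_health)
    for disk_id in faulty:
        disk_health[disk_id] = True   # stands in for the physical swap
    return {"replaced": faulty,
            "remaining_faults": locate_faulty_disks(disk_health)}

if __name__ == "__main__":
    fleet = {"rack1-bay3": True, "rack2-bay7": False, "rack4-bay1": False}
    print(run_inspection(fleet))
```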
These four aspects show that artificial intelligence is penetrating and profoundly changing the way data centers operate. More importantly, with AI, data center operators can run more workloads on the same physical hardware, quickly aggregate and analyze data, and generate productive outputs.
These workloads are typically both data- and compute-intensive: the training and inference of AI models demand significant computational power. AI in data centers must therefore be backed by massive computational capability. All of this is nearly impossible to achieve with general-purpose chips alone, and scaling them can be prohibitively expensive.
To achieve true AI in data centers, a combination of high-performance processors (CPUs), high-speed memory, and specialized hardware such as GPUs must be leveraged to efficiently process large amounts of data and support AI workloads. These specialized processors are designed to perform matrix calculations, making them particularly efficient for machine learning tasks that involve parallel processing of large amounts of data and can significantly accelerate the processing of AI workloads.
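Why matrix hardware matters can be seen from the structure of the operation itself: every cell of a matrix product is an independent dot product, so the whole computation parallelizes naturally. The pure-Python sketch below computes those dot products sequentially; a GPU kernel would evaluate each (row, column) pair concurrently.

```python
# The core of most ML workloads is matrix multiplication. Each output
# cell is an independent dot product -- the unit of parallelism that
# GPUs and other matrix engines exploit. This version is sequential.

def matmul(a, b):
    """Multiply matrix a (m x k) by b (k x n) as nested lists."""
    k = len(b)
    n = len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(len(a))]

if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(matmul(a, b))  # [[19, 22], [43, 50]]
```

Since the m×n output cells share no intermediate state, a GPU can assign each to its own thread; that independence, not any single fast core, is what makes specialized hardware so much faster at ML workloads.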
② Competition Landscape of AI Chips in Data Centers
According to Arizton’s analysis, the global data center market size was $215.8 billion in 2021 and is expected to grow at a compound annual growth rate of 4.95% to reach $288.3 billion by 2027. Another market analysis firm, P&S Intelligence, estimated the global data center market at $220 billion in 2021 and expects it to grow at a compound annual growth rate of 5.1%, reaching $343.6 billion by 2030. Although the two agencies’ figures differ slightly, it is evident that the data center market is a huge one, worth hundreds of billions of dollars.
Data centers are the facilities where enterprises house their computing, server, and network systems, along with the infrastructure components that meet their IT needs. Servers, a key component of data centers, account for a significant share of this market. According to Industry Research, the global data center server market was approximately $33.986 billion in 2021 and is expected to grow at a compound annual growth rate of 12.69% over the forecast period, reaching $69.598 billion by 2027.
Artificial intelligence requires significant computing power. With the increasing deployment of AI by enterprises, end-users, cloud service providers, and even telecommunications service providers, the demand for dedicated AI processors will continue to soar in 2023, and the AI chip market will continue the growth trend of the past few years. Analysis from McKinsey suggests that by 2025, data centers are expected to be the main revenue source for AI chips, reaching $15 billion, a 150% increase compared to 2017.
According to analysts at research firm Omdia, approximately 2 million servers shipped in 2023 will be equipped with at least one co-processor to accelerate computing workloads, a 53% increase compared to 2022, with a significant portion using GPUs, TPUs, and specialized AI accelerators.
Competition in the lucrative data center chip market is intense. The initial core of this competition was the CPU rivalry between Intel and AMD. As AI applications in data centers continue to expand, the competition has spilled over into new product categories. Two years ago, Intel launched its first GPU for data centers, the Intel Server GPU. In response, GPU maker NVIDIA introduced an Arm-based CPU called “Grace” to enter the server CPU market, expected to launch in 2023. The booming data center industry is profoundly affecting the sales prospects of Intel, AMD, and NVIDIA and their competitive relationships with one another.
According to the 2023 Artificial Intelligence Chip report released by Reportlinker, the global AI chip market will grow from $15.65 billion in 2022 to $23.29 billion in 2023, with a compound annual growth rate (CAGR) of 48.8%. The AI chip market is expected to reach $88.85 billion in 2027, with a CAGR of 39.8%. The main participants in the AI chip market now include NVIDIA, Intel, AMD, Alphabet, Mediatek, Qualcomm, and NXP, among others. However, competition in the data center market is mainly focused on NVIDIA, Intel, and AMD.
⑴ NVIDIA DGX A100
NVIDIA invented the GPU and, through steady iteration of its GPU architectures, has driven advances in AI, HPC, gaming, creative design, autonomous vehicles, and robotics. In May 2020, NVIDIA introduced the EGX A100 and EGX Jetson Xavier NX, its first edge AI products based on the NVIDIA Ampere architecture. Its DGX line of AI supercomputers has likewise advanced from the Volta-based DGX Station, DGX-1, and DGX-2 to the Ampere-based DGX A100.
These AI supercomputers are built for deep learning training, accelerated analytics, and inference. The lineup includes NVIDIA’s flagship data center system, the DGX A100, which integrates eight A100 Tensor Core GPUs and up to 640GB of GPU memory, making it a universal system for a wide variety of AI workloads. The popular ChatGPT primarily runs on NVIDIA A100s, leveraging Microsoft Azure’s cloud resources and services. With the combined demand from ChatGPT and other Microsoft applications, Microsoft’s total demand for AI servers is estimated to reach around 25,000 units in 2023.
The new NVIDIA H100 Tensor Core GPU is NVIDIA’s next-generation high-performance data center GPU, designed to deliver outstanding performance, scalability, and security for every workload. Built on the NVIDIA Hopper GPU architecture, the H100 will accelerate AI training and inference, HPC, and data analytics applications in cloud data centers, servers, edge systems, and workstations. It is expected to increase the speed of large language models by up to 30 times compared to the previous generation. According to NVIDIA’s earlier information, the H100 Tensor Core GPU is planned to be released in 2023.
⑵ Intel Habana Gaudi2
Shortly after NVIDIA announced its Hopper GPU architecture in 2022, Intel’s Habana Labs announced its second-generation deep learning processors in May of the same year: the Gaudi2 for training and the Greco for inference. Built for AI deep learning applications, the Gaudi2 uses an advanced 7nm process and includes 24 Tensor Processor Cores optimized for training large-scale deep learning models, up from eight in Habana’s previous processor.
In addition, the amount of SRAM and HBM2E memory included in each Gaudi 2 chip has doubled and tripled, respectively. Intel claims that Gaudi 2 offers three times the throughput of Habana’s first-generation AI training chip. In internal benchmark tests, the chip’s throughput was twice that of NVIDIA’s data center flagship A100-80GB GPU.
One of the key features of the Gaudi 2 chip is that some network components are directly integrated into the processor. This reduces the additional network hardware that data center operators must purchase, thereby reducing costs. The Gaudi 2 is equipped with 24 100-gigabit Ethernet ports, 14 more than its predecessor. Intel’s first true data center GPU, code-named Ponte Vecchio, is expected to be released in the first half of 2023.
⑶ AMD Instinct MI250X
2022 was a year of rapid development for AI chips. In September of that year, AMD released Zen 4, an updated version of its Zen microarchitecture built on a 5nm process. While AMD is best known for its CPUs and graphics cards and has made less noise about hardware built specifically for AI, the Zen 4-based Ryzen 7000 series, announced in May of that year, added AVX-512 instructions that benefit machine-learning workloads.
Of course, AMD is not completely silent on data center AI chips. The AMD Instinct MI200 series accelerators are AMD’s data center GPUs, built on the AMD CDNA 2 architecture with AMD Infinity Fabric technology and advanced packaging. For high-performance computing workloads, the AMD Instinct MI250X delivers outstanding GPU performance: up to 47.9 TFLOPS of double-precision (FP64) throughput, rising to a peak theoretical 95.7 TFLOPS with FP64 Matrix Core technology. For machine learning and deep learning workloads, the MI250X provides up to 383 TFLOPS of peak theoretical half-precision (FP16) performance.
③ Future Outlook for AI in Data Centers
Artificial intelligence is becoming the driving force behind modern technology in various industries, with applications in optimization, preventive maintenance, virtual assistants, fraud detection, and anomaly detection, among others. Some even say that many data centers would not be economically or operationally feasible without AI. At the same time, data centers must provide significant computing and storage resources for AI to process large datasets in real-time and perform training and inference. Through specialized hardware such as GPUs and TPUs, data centers can accelerate complex computations, supporting AI applications and workloads.
According to TrendForce, AI servers equipped with general-purpose GPUs (GPGPUs) accounted for only about 1% of global annual server shipments in 2022. From 2022 to 2026, AI server shipments are expected to grow at a compound annual growth rate of 10.8%. The four major North American service providers (Google, AWS, Meta, and Microsoft) accounted for about 66.2% of global AI server procurement in 2022. The mainstream products in the server GPU market for AI-related computing include NVIDIA’s H100, A100, and A800 (the A800 being designed specifically for the Chinese market), as well as AMD’s MI250 and MI250X series. NVIDIA holds about 80% of the server GPU market share, while AMD holds about 20%.
According to an IDC report, global AI spending is expected to grow by 26.9% in 2023, reaching $154 billion. By 2026, AI-centric system spending is expected to exceed $300 billion. Looking ahead, the future applications and trends of AI in data centers will be prominent. AI revitalizes data centers by improving operational efficiency, performance, and security. Data centers can benefit in multiple ways by integrating AI into their organization and operations.
2023 will be a year of significant progress in the field of artificial intelligence. In the coming years, AI’s ability to automate the entire data center will improve. By then, competition for data center AI chips will intensify, with many innovative companies expected to join the competition in addition to the three established giants.