Ai Chip Shortages Continue, But There May Be An End In Sight

serveurs

While GPUs are in high demand, they still need high-performance memory chips for AI apps. The market is tight for both - for now.

Credit: Shutterstock/Javier Pardina

As the adoption of generative artificial intelligence (genAI) continues to soar, the infrastructure to support that growth is currently running into a supply and demand bottleneck.

Sixty-six percent of enterprises worldwide said they would be investing in genAI over the next 18 months, according to IDC research. Among organizations indicating genAI will see increased IT spending in 2024, infrastructure will account for 46% of the total spend. The problem: a key piece of hardware needed to build out that AI infrastructure is in short supply.

The breakneck pace of AI adoption over the past two years has strained the industry's ability to supply the special high-performance chips needed to run the process-intensive operations of genAI and AI in general. Most of the focus on processor shortages has been on the exploding demand for Nvidia GPUs and alternatives from various chip designers such as AMD, Intel, and the hyperscale datacenter operators, according to Benjamin Lee, a professor in the Department of Computer and Information Science at the University of Pennsylvania.

"There has been much less attention focused on exploding demand for high-bandwidth memory chips, which are fabricated in Korea-based foundries run by SK Hynix," Lee said.

Last week, SK Hynix said its high-bandwidth memory (HBM) products, which are needed in combination with high-performance GPUs to handle AI processing requirements, are almost fully booked through 2025 because of high demand. The price of HBMs has also recently increased by 5% to 10%, driven by significant premiums and increased capacity needs for AI chips, according to market research firm TrendForce.

SK Hynix

HBM chips are expected to account for more than 20% of the total DRAM market value starting in 2024, potentially exceeding 30% by 2025, according to TrendForce Senior Research Vice President Avril Wu. "Not all major suppliers have passed customer qualifications for [high-performance HBM], leading buyers to accept higher prices to secure stable and quality supplies," Wu said in a research report.

Why GPUs need high-bandwidth memory

Without HBM chips, a data center server's memory system would be unable to keep up with a high-performance processor, such as a GPU, according to Lee. HBMs are what supply GPUs with the data they process. "Anyone who purchases a GPU for AI computation will also need high-bandwidth memory," Lee said.

"In other words, high-performance GPUs would be poorly utilized and often sit idle waiting for data transfers. In summary, high demand for SK Hynix memory chips is caused by high demand for Nvidia GPU chips and, to a lesser extent, associated with demand for alternative AI chips such as those from AMD, Intel, and others," he said.

"HBM is relatively new and picking up a strong momentum because of what HBM offers - more bandwidth and capacity," said Gartner analyst Gaurav Gupta. "It is different than what Nvidia and Intel sell. Other than SK Hynix, the situation for HBM is similar for other memory players. For Nvidia, I believe there are constraints, but more associated with packaging capacity for their chips with foundries."

While SK Hynix is reaching its supply limits, Samsung and Micron are ramping up HBM production and should be able to support the demand as the market becomes more distributed, according to Lee.

The current HBM shortages are primarily in the packaging from TSMC (i.e., chip-on-wafer-on-substrate or CoWoS), which is the exclusive supplier of the technology. According to Lee, TSMC is more than doubling its SOIC capacity and boosting capacity for CoWoS by more than 60%. "I expect the shortages to ease by the end of this year," he said.

At the same time, more packaging and foundry suppliers are coming online and qualifying their technology to support NVIDIA, AMD, Broadcom, Amazon, and others using TSMC's chip packaging technology, according to Lee.

Nvidia, whose production represents about 70% of the global supply of AI server chips, is expected to generate$40 billion in revenue from GPU sales this year, according to Bloomberg analysts. By comparison, competitors Intel and AMD are expected to generate$500 million and$3.5 billion, respectively. But all three are ramping production as quickly as possible.

Nvidia is tackling the GPU supply shortage by increasing its CoWoS and HBM production capacities, according to TrendForce. "This proactive approach is expected to cut the current average delivery time of 40 weeks in half by the second quarter [of 2024], as new capacities start to come online," TrendForce report said in its report. "This expansion aims to alleviate the supply chain bottlenecks that have hindered AI server availability due to GPU shortages."

Shane Rau, IDC's research vice president for computing semiconductors, said that while demand for AI chip capacity is very high, markets are adapting. "In the case of server-class GPUs, they're increasing supply of wafers, packaging, and memories. The increased supply is key because, due to their performance and programmability, server-class GPUs will remain the platform of choice for training and running large AI models."

Chipmakers scramble to meet the demand for AI

Global spending on AI-focused chips is expected to hit$53 billion this year - and to more than double over the next four years, according to Gartner Research. So it's no surprise that chipmakers are rolling out new processors as quickly as they can.

Intel has announced its plans for chips aimed at powering AI functions with its Gaudi 3 processors, and has said its Xeon 6 processors, which can run retrieval augmented generation (RAG) processes, will also be key. The Gaudi 3 GPU was purpose-built for training and running massive large language models (LLMs) that underpin genAI in data centers.

Meanwhile, AMD in its most recent earnings call, touted its MI300 GPU for AI data center workloads, which also has good market traction, according to IDC Group Vice President Mario Morales, adding that the research firm is tracking over 80 semiconductor vendors developing specialized chips for AI.

On the software side of the equation, LLM creators are also developing smaller models tailored for specific tasks; they require fewer processing resources and rely on local, proprietary data - unlike the massive, amorphous algorithms that boast hundreds of billions or even more than a trillion parameters.

Intel's strategy going forward is similar: it wants to enable genAI on every type of computing device, from laptops to smart phones. Intel's Xeon 6 processors will include some versions with onboard neural processing units (NPUs or "AI accelerators") for use in workstations, PCs and edge devices. Intel also claims its Xeon 6 processors will be good enough to run smaller, more customized LLMs.

Even so, without HBMs, those processors would likely struggle to keep up with genAI's high performance demands.

Précédent:Get an Apple iPad (9th or 10th Gen) for under $400 following Apple's 'Let Loose' event

Suivant:3+ reasons Apple might want to make its own server chips

Cisco Price, Dell Price, Huawei Price, ZTE HPE Fortinet Switch Router Server At Low Price

serveurs

Nouvelles chaudes

Best 10Gb Switch for SMB in 2025: Unlock Next-Gen Network Performance

S5735-L48T4S-A: Complete Guide with Features, Specifications, and Benefits

S5735-L48P4X-A1: Reliable PoE+ CloudEngine Switch

S5735-L48LP4XE-A-V2: Scalable, Secure, and PoE-Ready for Demanding Enterprise Deployments

S5735-L48LP4S-A-V2 Powers Smarter Campus Networks with Advanced PoE and Cloud Management

S5735-L24T4X-A1 Empowers Installers with Scalable, Reliable, and Efficient Network Access

Best Ethernet Switches for Business (2025): Selection Guide and Top Picks

Huawei S5735-L24T4S-A1: A Compact, Stackable Access Switch Built for the Future

Huawei S5735-L24T4S-A: High-Performance Stacking Meets Zero-Noise Deployment

S5735-L24P4XE-A-V2: Huawei’s Smart Choice for High-Density Campus Deployments

S5735-L24P4X-A1: Huawei’s High-Performance Access Switch Redefining Campus Networking

Huawei S5735-L24P4S-A1 Review: Reliable Gigabit Access with Enterprise-Grade Features

What Is an Orthogonal Architecture?

Huawei s5735-l24p4s-a-v2 Delivers Scalable, Secure, and Smart PoE Access for Modern IT Infrastructures

Huawei S5735-L48T4XE-A-V2 Switch Delivers Enterprise-Grade Performance in a Compact Design

Huawei S5735-L48P4XE-A-V2 Review: Versatile Campus Switch with iStack and Full L3 Support

Differences Between Huawei CE Series and S Series Switches

Huawei CloudEngine S5735 Switches Set the Benchmark for High-Performance, Energy-Efficient Switching

Huawei CloudEngine S5731‑S48P4X Datasheet

Huawei CloudEngine S5731‑S24P4X Datasheet

Huawei S5731-S Empowers Next-Generation Campus Networks with Advanced Capabilities

Huawei S5731-H24P4XC Switch Review: Power-Packed Performance and Smart PoE

Huawei S5731-H Series Switches Redefine Campus Networking with Intelligent High-Performance Architecture

Top Features of the Huawei S5731-S24T4X: The Ultimate Gigabit Access Switch for Modern Networks

General Power Module Fault Location Procedure (CE8800 & 7800 & 6800 & 5800)

How Do I Split a Stack? How to clear the stacking configuration?

Huawei CloudEngine S5731 Datasheet

Huawei CloudEngine S5731-S24P4X: Powerful Enterprise-Grade Switch Explained

Huawei S5731-S48T4X Review: Powerful Enterprise Switch for High-Speed Networking

Why are network cables limited to 100 meters?

AI chip shortages continue, but there may be an end in sight

While GPUs are in high demand, they still need high-performance memory chips for AI apps. The market is tight for both - for now.

Why GPUs need high-bandwidth memory

Chipmakers scramble to meet the demand for AI

Tags chauds: Industrie de la technologie Ia générative CPUs et processeurs

Ordering Guide

Ressources ressources

À propos de nous

Huawei CloudEngine S5731‑S48P4X Datasheet