
[Innovation Focus] AMD vs Nvidia, Clash in AI GPUs... MI300X vs H100

Published 2024.01.18 10:56

[This article was pre-released on e4ds+ on December 19, 2023 at 15:29.]


▲AMD MI300X and Nvidia H100
Nvidia: “We're twice as fast” vs AMD: “No”
27 trillion won worth of H100 chips to be produced by year's end... AMD moves into the AI server chip market
Nvidia already preparing the next-generation H200 as AI chip competition heats up

The market for ultra-high-performance GPUs that accelerate servers and AI workloads, including generative AI and very large language models, is heating up. Since AMD released its new MI300X GPU on the 7th to challenge Nvidia, which has dominated the market, a war of nerves has broken out between the two companies.

■ Nvidia's rebuttal: “The H100 is twice as fast”

AMD unveiled the Instinct MI300 series at its 'Advancing AI' event on the 7th. In its product announcement, AMD claimed the MI300X offers up to 20% faster performance than the H100 on a single GPU, and up to 60% faster performance in eight-GPU server configurations.

NVIDIA responded quickly with a rebuttal. On the 14th, NVIDIA noted that TensorRT-LLM, an open-source library that includes the latest kernel optimizations for the NVIDIA Hopper architecture, had been released, and claimed that with these optimizations the H100's inference performance on the Llama 2 70B model is twice that of the AMD MI300X.

NVIDIA argued that this gap arose mainly because AMD's earlier comparison did not use the latest optimization software on the H100.

NVIDIA also explained that at a batch size of one, that is, one inference request at a time, a single inference can be processed in 1.7 seconds on the DGX H100, and that with a fixed 2.5-second response time, an eight-GPU DGX H100 server can process more than five Llama 2 70B inferences per second.
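NVIDIA's arithmetic here can be sanity-checked with a back-of-the-envelope sketch. Note that the batch size of 14 below is a hypothetical illustration, not a figure NVIDIA published; only the 1.7-second latency and 2.5-second response budget come from the article:

```python
# Back-of-the-envelope model of the figures above (illustrative only).
# At batch size 1, one Llama 2 70B inference takes ~1.7 s on a DGX H100,
# i.e. well under 1 inference per second. Batching raises the latency of
# each batch but multiplies the inferences completed per second.

def throughput(batch_size: int, batch_latency_s: float) -> float:
    """Completed inferences per second for a given batch size and latency."""
    return batch_size / batch_latency_s

# Batch size 1 at 1.7 s per request: ~0.59 inferences/s.
single = throughput(1, 1.7)

# Hypothetical: if a batch of 14 requests fits inside the fixed 2.5 s
# response budget, the server clears the "more than 5 inferences/s" bar.
batched = throughput(14, 2.5)

print(f"batch 1:  {single:.2f} inferences/s")
print(f"batch 14: {batched:.2f} inferences/s")
```

This is why serving benchmarks hinge on the chosen response-time budget: throughput scales with whatever batch size the latency budget allows.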

■ AMD rebuts: “2.1x performance advantage with latest optimizations”


▲Comparison table of H100 and MI300X inference performance on Llama 2 70B tested by AMD (Source: AMD Blog)


Two days later, on the 16th, AMD issued a rebuttal press release of its own, publishing new benchmark results that directly address Nvidia's claims.

AMD ran new tests using TensorRT-LLM on the NVIDIA H100, comparing the MI300X GPU on FP16 data types against the H100 on FP8 data types. AMD also converted its performance data from relative latency metrics to absolute throughput.

The results also incorporate the latest performance improvements in AMD's updated ROCm software over the performance recorded in the November release.

Ultimately, with these updated figures, AMD's tests showed the MI300X achieving a 2.1x performance advantage over the H100 in vLLM comparisons, and AMD also claimed a 1.3x latency improvement when comparing the H100 running TensorRT-LLM against the MI300X running vLLM.

At a batch size of one, Llama 2 70B inference time was reported as 1.7 seconds on the H100 and 1.6 seconds on the MI300X, giving the MI300X a slight edge.
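As a quick check of those single-batch figures, the implied advantage is a simple latency ratio (using only the two numbers quoted above):

```python
# Batch-size-1 Llama 2 70B latencies quoted in the article.
h100_latency_s = 1.7    # NVIDIA H100
mi300x_latency_s = 1.6  # AMD MI300X

# Latency ratio: how much faster the MI300X responds at batch size 1.
speedup = h100_latency_s / mi300x_latency_s
print(f"MI300X is {speedup:.2f}x faster at batch size 1")
```

A roughly 1.06x edge at batch size 1, in other words, which is why both vendors lean on batched-throughput numbers for their headline claims.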

AMD directly addressed the performance-verification controversy by testing under the same conditions NVIDIA had used for its single-batch, single-inference comparison: 2,048 input tokens and 128 output tokens.

■ NVIDIA H100: about 550,000 units produced this year... total market value of 27 trillion won


▲NVIDIA H100 (Image: NVIDIA)


NVIDIA's H100, a high-end AI server chip released in March 2022, currently sells for between 40 million and 60 million won, with about 550,000 units produced this year alone.

Based on a median market price of 50 million won, the total value of this year's production is estimated at 27.5 trillion won, a scale that suggests how large a market the H100 has formed.
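The estimate is simple multiplication, using the unit count and median price given in the article:

```python
# Rough market-size estimate implied by the article, in Korean won.
units = 550_000                 # H100 units produced this year
median_price_won = 50_000_000   # midpoint of the 40M-60M won range
total_won = units * median_price_won

print(f"{total_won / 1e12:.1f} trillion won")  # 27.5 trillion won
```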

Nvidia's sales in the third quarter of this year amounted to $18.1 billion, or about 23 trillion won.

■ AI chip market booming... AMD likely to take some market share

AMD has thrown its hat into the ring to challenge Nvidia, which dominates the AI server chip market with a roughly 90% share.

Microsoft, Meta, and others are reportedly interested in the MI300 AI GPU series and are considering adopting it.

Investment experts are also raising their target stock prices for AMD and predicting that AMD's server GPU sales will increase sharply in 2024.

With demand for AI server chips far outstripping supply, AMD stands to benefit, and its products are expected to establish themselves in the market as a substitute that eases the AI server chip shortage.

With NVIDIA's AI server chips such as the H100 in short supply even at premium prices, the prevailing view in the industry is that AMD is effectively the only option for companies seeking rapid AI server deployment.

■ MI300X·A to be released in 2024...Nvidia announces H200 release

The AMD MI300X and MI300A are expected to reach the market in the second quarter of 2024, and NVIDIA's next-generation AI GPU, the H200, is also expected in the second quarter of 2024 with performance twice that of the H100. While AMD's latest GPU uses HBM3, Nvidia announced that the H200 will be the first to carry fifth-generation HBM3E, which is expected to intensify competition in AI semiconductor performance.

As with the long-running CPU battle between AMD and Intel, the fierce competition between GPU rivals AMD and Nvidia is now drawing attention in the AI server market.