▲NVIDIA HGX H200 (Photo: NVIDIA)
HGX H200 System and Cloud Instances Coming Soon
NVIDIA, whose valuation has been climbing as the generative AI era takes hold, has introduced its next-generation GPU for data centers and AI computing.
NVIDIA, a leader in AI computing technology, announced on the 14th that it is launching the NVIDIA HGX H200.
The platform features the NVIDIA H200 Tensor Core GPU, based on the NVIDIA Hopper™ architecture, with advanced memory capable of processing massive amounts of data for generative AI and high-performance computing workloads.
The NVIDIA H200 is the first GPU to offer HBM3e, whose faster, larger memory accelerates generative AI and large language models while advancing scientific computing for HPC workloads. With HBM3e, the H200 delivers 141GB of memory at 4.8 terabytes per second, nearly double the capacity and 2.4x the bandwidth of its predecessor, the NVIDIA A100.
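A quick back-of-envelope check bears those ratios out, taking the 80GB A100 variant's roughly 2TB/s of HBM2e bandwidth as the baseline. The short Python sketch below is purely illustrative:

```python
# Rough comparison of the H200's HBM3e figures against the A100 (80GB HBM2e),
# using the numbers cited in NVIDIA's announcement. Illustrative only.
h200_capacity_gb, h200_bandwidth_tbs = 141, 4.8
a100_capacity_gb, a100_bandwidth_tbs = 80, 2.0  # A100 80GB: ~2TB/s HBM2e

print(f"Capacity ratio:  {h200_capacity_gb / a100_capacity_gb:.2f}x")      # ~1.76x, "nearly double"
print(f"Bandwidth ratio: {h200_bandwidth_tbs / a100_bandwidth_tbs:.1f}x")  # 2.4x
```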
SK Hynix, one of South Korea's two largest memory manufacturers, is reported to lead in this technology and to be the exclusive supplier of HBM3e to NVIDIA.
H200-based systems from leading global server manufacturers and cloud service providers are expected to be available in Q2 2024.
“Creating intelligence from generative AI and HPC applications requires processing massive amounts of data quickly and efficiently, with massive amounts of fast GPU memory,” said Ian Buck, vice president of Hyperscale and HPC at NVIDIA. “With NVIDIA H200, the industry’s most advanced end-to-end AI supercomputing platform, we’re accelerating our journey to solve the world’s most important challenges.”
The NVIDIA H200 will be available in NVIDIA HGX H200 server boards in four-way and eight-way configurations, which are compatible with both the hardware and software of HGX H100 systems. It is also available in the NVIDIA GH200 Grace Hopper™ Superchip with HBM3e, announced in August.
Additionally, NVIDIA’s global ecosystem of partner server manufacturers can update their existing systems with the H200. Partners include ASRock Rack, ASUS, Dell Technologies, Eviden, GIGABYTE, Hewlett Packard Enterprise, Ingrasys, Lenovo, QCT, Supermicro, Wistron and Wiwynn.
Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure will be among the first cloud service providers to deploy H200-based instances starting next year, in addition to CoreWeave, Lambda, and Vultr.
The HGX H200 is based on NVIDIA NVLink™ and NVSwitch™ high-speed interconnects. It delivers extreme performance across a wide range of application workloads, including LLM training and inference on large models with over 175 billion parameters. In its eight-way configuration, the HGX H200 provides over 32 petaflops of FP8 deep learning compute and 1.1TB of aggregate high-bandwidth memory, delivering extreme performance for generative AI and HPC applications.
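Those aggregate figures follow directly from the per-GPU specs, and the memory headroom explains the 175-billion-parameter claim: at 2 bytes per FP16 weight, such a model needs roughly 350GB for its parameters alone, well within the 1.1TB pool. A minimal sketch of the arithmetic follows; the per-GPU FP8 throughput is inferred from the 32-petaflop total rather than an official figure, and the model-size estimate covers weights only, ignoring activations and optimizer state:

```python
# Back-of-envelope numbers for an eight-way HGX H200, based on the
# totals cited above. Per-GPU FP8 throughput is inferred, not official.
num_gpus = 8
hbm_per_gpu_gb = 141
total_fp8_petaflops = 32

total_hbm_tb = num_gpus * hbm_per_gpu_gb / 1000  # 8 x 141GB = ~1.1TB aggregate
fp8_per_gpu = total_fp8_petaflops / num_gpus     # ~4 petaflops per GPU

# Weight memory for a 175B-parameter model at FP16 (2 bytes per parameter),
# ignoring activations, KV cache, and optimizer state.
params = 175e9
weights_tb = params * 2 / 1e12                   # ~0.35TB

print(f"Aggregate HBM:     {total_hbm_tb:.2f} TB")
print(f"FP8 per GPU:       {fp8_per_gpu:.0f} petaflops (inferred)")
print(f"175B FP16 weights: {weights_tb:.2f} TB -> fits in the {total_hbm_tb:.1f} TB pool")
```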
Meanwhile, the NVIDIA accelerated computing platform is supported by powerful software tools that enable developers and enterprises to build and accelerate production-ready applications, from AI to HPC. These include the NVIDIA AI Enterprise software suite for workloads such as speech, recommender systems, and hyperscale inference.
NVIDIA H200 will be available from global system manufacturers and cloud service providers starting in the second quarter of 2024, the company said.