AWS's Graviton 4 processor, built on Arm's Neoverse V2, is leading a major shift in cloud computing, with 30% more compute performance and 75% more memory bandwidth than the previous generation, plus optimizations for AI and machine learning workloads.
▲AWS Graviton4 chip image
Compute performance up 30%, memory bandwidth up 75%
Efficient AI and machine learning processing, strong HPC performance
AWS's Graviton 4 processor, built on Arm's Neoverse V2, is driving a major change in the cloud computing environment. It delivers 30% more compute performance and 75% more memory bandwidth than the previous-generation processor, along with optimizations for handling AI and machine learning tasks efficiently.
Arm said on the 23rd that Graviton 4, showcased at the recent AWS re:Invent 2024, is part of a long-standing collaboration between Arm and AWS to lay the foundation for a more efficient, sustainable, and powerful cloud.
The latest AWS Graviton 4 processor, based on Arm Neoverse V2, delivers up to 30% more compute performance, 50% more cores, and 75% more memory bandwidth than the previous-generation Graviton 3.
The Arm Neoverse V2 platform brings new features of the Armv9 architecture, including high-performance floating-point and vector instructions. Features such as SVE/SVE2, Bfloat16, and Int8 MatMul deliver strong performance for AI/ML and HPC workloads.
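As a rough illustration of how a developer might confirm that these extensions are visible on a given instance, the following Python sketch parses the "Features" line that Linux exposes in /proc/cpuinfo on AArch64. The flag names (sve, sve2, bf16, i8mm) are the ones the Linux kernel reports; the helper names parse_cpu_features and report_features are hypothetical and are not part of any Arm or AWS tooling.

```python
# Sketch: check for the Armv9-era features named in the article.
# Assumes a Linux AArch64 system; on other platforms every entry reports False.

# Kernel flag name -> label used in the article.
WANTED = {
    "sve": "SVE",
    "sve2": "SVE2",
    "bf16": "Bfloat16",
    "i8mm": "Int8 MatMul",
}


def parse_cpu_features(cpuinfo_text):
    """Collect the flags listed on 'Features' lines of /proc/cpuinfo text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            features.update(value.split())
    return features


def report_features(cpuinfo_text):
    """Map each article feature label to True/False depending on its flag."""
    present = parse_cpu_features(cpuinfo_text)
    return {label: flag in present for flag, label in WANTED.items()}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        text = ""  # not Linux, or the file is unreadable
    for label, available in report_features(text).items():
        print(f"{label}: {'yes' if available else 'no'}")
```

Keeping the parser separate from the file read makes it easy to try against saved cpuinfo output from different instance types.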
To accelerate the adoption of AI workloads, Arm launched Arm Kleidi earlier this year in collaboration with leading AI frameworks and the software ecosystem, so that the entire ML stack can benefit from inference performance optimizations available out of the box on Arm.
This enables developers to build workloads without requiring separate Arm-specific expertise.
For example, in PyTorch, these optimizations significantly improved tokens/sec and time-to-first-token metrics, making it practical to run LLMs such as Llama 3 70B and Llama 3.1 8B on AWS Graviton 4.
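For readers unfamiliar with the two metrics, here is a minimal sketch of how time-to-first-token and tokens/sec could be measured for any streaming token source. It is not Arm's or PyTorch's benchmark code: measure_stream and fake_llm_stream are hypothetical names, the generator merely stands in for a real model, and tokens/sec is defined here as total tokens divided by total wall time, which is only one of several conventions in use.

```python
import time


def measure_stream(tokens, clock=time.perf_counter):
    """Return (time_to_first_token_s, tokens_per_sec, token_count).

    `tokens` is any iterable that yields tokens as they are generated.
    `clock` is injectable so the function can be tested deterministically.
    """
    start = clock()
    first_token_time = None
    count = 0
    for _ in tokens:
        now = clock()
        if first_token_time is None:
            first_token_time = now  # moment the first token arrived
        count += 1
    end = clock()
    if count == 0:
        raise ValueError("token stream was empty")
    ttft = first_token_time - start
    total = end - start
    tokens_per_sec = count / total if total > 0 else float("inf")
    return ttft, tokens_per_sec, count


def fake_llm_stream(n_tokens, prefill_s=0.05, per_token_s=0.01):
    """Hypothetical stand-in for a model: a prefill delay, then steady decoding."""
    time.sleep(prefill_s)
    for i in range(n_tokens):
        if i:
            time.sleep(per_token_s)
        yield f"tok{i}"


if __name__ == "__main__":
    ttft, tps, n = measure_stream(fake_llm_stream(20))
    print(f"tokens: {n}, time to first token: {ttft:.3f}s, throughput: {tps:.1f} tok/s")
```

In a real measurement the generator would be replaced by the streaming output of the model under test, with the prompt and generation length held fixed across runs.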
For HPC (high-performance computing) workloads, Graviton 4 delivers significant performance improvements over Graviton 3E, including 16% more main memory bandwidth per core and double the L2 cache per vCPU.
These gains are critical for HPC applications, whose performance is primarily constrained by memory bandwidth.
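The per-core figure can be sanity-checked against the headline numbers with simple arithmetic. The sketch below assumes, purely for illustration, that the Graviton 3E baseline has the same core count and total memory bandwidth as Graviton 3 (the article does not state this), and it treats the "up to" figures of 75% more bandwidth and 50% more cores as exact. The function name per_core_bandwidth_gain is hypothetical.

```python
def per_core_bandwidth_gain(bandwidth_gain, core_gain):
    """Fractional change in memory bandwidth per core.

    Both arguments are fractional increases over the baseline,
    e.g. 0.75 for "75% more bandwidth" and 0.50 for "50% more cores".
    """
    return (1.0 + bandwidth_gain) / (1.0 + core_gain) - 1.0


if __name__ == "__main__":
    gain = per_core_bandwidth_gain(bandwidth_gain=0.75, core_gain=0.50)
    print(f"Per-core bandwidth change: {gain:.1%}")  # prints 16.7%
```

Under those assumptions the result is about 16.7%, which is broadly in line with the 16% per-core improvement quoted above; the article does not say how that figure was actually derived.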
For EDA (electronic design automation) workloads, Arm’s engineering team has measured that Graviton 4 delivers up to 37% higher performance than Graviton 3 on RTL simulation workloads.
Over the past several years, adoption has continued to grow across the software ecosystem, with end customers deploying a variety of cloud workloads on AWS Graviton processors. Customers are cutting costs, seeing better performance, and reducing their carbon footprint in support of their sustainability goals.
In this way, AWS Graviton 4 accelerates innovation in cloud computing and contributes to building a more powerful and efficient cloud environment.