▲Intel Gaudi 2 Accelerator (Photo: Intel)
Intel Releases Latest MLPerf Test Results
Intel is increasing the computational performance of its AI accelerators. Intel Gaudi 2 has proven the results of repeated innovation by receiving scores that are about twice as high as the previous results in the GPT-3 benchmark.
On the 10th, Intel announced at MLCommons that it had published the MLPerf Training v3.1 benchmark measurement results of the 4th generation Intel Xeon Scalable Processor with Intel Gaudi 2 Accelerator and Intel Advanced Matrix Extensions (Intel AMX).
Intel highlighted that Gaudi2 doubled its performance by applying the 8-bit floating point (FP8) data type in the v3.1 training GPT-3 benchmark, an industry standard for AI model training. With this benchmark submission, Intel further solidifies its promise to bring AI everywhere with competitive AI solutions.
The FP8 implementation cut training time by more than half compared to the June MLPerf submission: 384 Intel Gaudi2 accelerators completed GPT-3 training in 153.38 minutes, versus the 311 minutes recorded in June with the same number of accelerators.
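The reported "2x" claim can be checked directly from the two published training times (same accelerator count in both runs):

```python
# Quick check of the reported speedup: June MLPerf run vs. latest FP8 run,
# both using 384 Gaudi2 accelerators.
june_minutes = 311.0
latest_minutes = 153.38

speedup = june_minutes / latest_minutes
print(f"{speedup:.2f}x")  # roughly 2x, consistent with Intel's claim
```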
Intel added that the Gaudi2 accelerator supports FP8 in both E5M2 and E4M3 formats, and also provides delay scaling options when needed. The latest MLCommons MLPerf results reportedly build on Intel’s improved AI performance over the MLPerf training results announced in June.
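The two FP8 variants trade range for precision: E5M2 keeps more exponent bits (range), E4M3 more mantissa bits (precision). A minimal sketch, assuming the OCP FP8 interchange format definitions (not Intel's internal implementation), of the largest finite value each format can represent:

```python
# Sketch of FP8 dynamic range, assuming the OCP FP8 format definitions.
def fp8_max(exp_bits, man_bits, ieee_like=True):
    """Largest finite value of a sign + exp_bits + man_bits float format."""
    bias = 2 ** (exp_bits - 1) - 1
    if ieee_like:
        # E5M2: the top exponent code is reserved for inf/NaN, as in IEEE 754
        emax = (2 ** exp_bits - 2) - bias
        return 2.0 ** emax * (2 - 2.0 ** -man_bits)
    else:
        # E4M3 (OCP variant): only the all-ones mantissa at the top exponent
        # encodes NaN, so the largest finite value uses the top exponent code
        emax = (2 ** exp_bits - 1) - bias
        return 2.0 ** emax * (2 - 2.0 ** -(man_bits - 1))

print(fp8_max(5, 2))                   # E5M2 -> 57344.0
print(fp8_max(4, 3, ieee_like=False))  # E4M3 -> 448.0
```

The wider range of E5M2 suits gradients, while the extra mantissa bit of E4M3 suits weights and activations, which is why accelerators typically support both.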
Additionally, Gaudi2 demonstrated training a Stable Diffusion multimodal model with 64 accelerators in 20.2 minutes using BF16. Intel plans to submit Stable Diffusion results using the FP8 data type in future MLPerf training benchmarks.
Benchmark results for BERT and ResNet-50 on eight Intel Gaudi2 accelerators showed training times of 13.27 and 15.92 minutes, respectively, using BF16.
▲4th generation Intel Xeon Scalable Processor (Photo: Intel)
Intel is the only CPU manufacturer to submit MLPerf results, and its 4th Gen Xeon submissions highlight the strong performance of its Xeon processors.
Intel presented results for ResNet-50, RetinaNet, BERT, and DLRM-dcnv2. On 4th Gen Intel Xeon Scalable Processors, the ResNet-50, RetinaNet, and BERT results were similar to the baseline performance submitted to the MLPerf benchmarks in June 2023. For DLRM-dcnv2, a new model submitted in June, the CPU recorded a training time of 227 minutes using only four nodes.
The performance of the 4th generation Xeon processors will enable many enterprises to economically and continuously train small and medium-sized deep learning models on existing enterprise IT infrastructure using general-purpose CPUs, especially for use cases where training is an intermittent workload.
Intel expects AI performance results to further improve in future MLPerf benchmarks through software updates and optimizations.
Sandra Rivera, senior vice president and general manager of Intel’s Data Center and AI Group, said, “We are continuously innovating with our AI portfolio and raising the bar on the MLCommons AI benchmarks with successive MLPerf performance results. Intel Gaudi and 4th Gen Xeon processors provide customers with distinct price-performance advantages and are available immediately.”