NVIDIA Breaks New Ground in AI Data Center Productivity and ROI

Published 2025.10.14 09:04


Blackwell's performance is proven in the InferenceMAX benchmark.

NVIDIA, a leader in AI computing technology, has transformed the AI inference market by demonstrating that AI data center performance can be converted into profit.

NVIDIA announced on the 13th that its next-generation AI platform, Blackwell, achieved the highest performance and efficiency in the newly announced InferenceMAX v1 benchmark.

The result is seen as a case in which full-stack co-design of hardware and software dramatically improved the productivity and return on investment (ROI) of AI data centers.

A $5 million investment in an NVIDIA GB200 NVL72 system could generate $75 million in token revenue, representing a 15x ROI.
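The 15x figure follows directly from the two dollar amounts quoted above. As a minimal sketch of that arithmetic (the function name `roi_multiple` is hypothetical, and "ROI" is read here as revenue divided by investment, which is how the quoted numbers work out):

```python
def roi_multiple(investment_usd: float, revenue_usd: float) -> float:
    """Return revenue as a multiple of the initial investment."""
    if investment_usd <= 0:
        raise ValueError("investment must be positive")
    return revenue_usd / investment_usd


# Figures quoted in the article: $5 million invested, $75 million in token revenue.
multiple = roi_multiple(5_000_000, 75_000_000)
print(f"{multiple:.0f}x")  # 15x
```

Under this reading the multiple counts gross token revenue. Subtracting the initial investment first would give a net gain of 14 times the outlay.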

This shows that AI inference is moving beyond being a mere technology and becoming core infrastructure that creates real business value.

InferenceMAX v1 is an independent benchmark published by SemiAnalysis that measures the total cost of compute across real-world scenarios, and its results showed Blackwell's performance leadership.

In particular, through collaboration on open-source models such as GPT-OSS 120B, Llama 3 70B, and DeepSeek R1, NVIDIA is delivering optimal performance in large-scale inference environments.

On Blackwell B200 systems, TensorRT-LLM v1.0 and the NVLink Switch's 1,800 GB/s of bandwidth were used to dramatically improve the throughput of the GPT-OSS model.

In particular, the gpt-oss-120b-Eagle3-v2 model, which uses speculative decoding, achieved 100 tokens per second (TPS) per user and raised throughput to as much as 30,000 TPS per GPU.

Blackwell delivers over 10,000 TPS per GPU, 4x the throughput of the H200.

Blackwell maximizes the economics of AI deployment by achieving 10x higher throughput per megawatt and 15x lower cost per million tokens, even in power-constrained environments.
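Both metrics are simple ratios. The article does not say how InferenceMAX computes them, so the sketch below uses common back-of-the-envelope definitions as an assumption: cost per million tokens from an hourly GPU cost and per-GPU throughput, and throughput per megawatt from total throughput and power draw. The function names and all input values are hypothetical placeholders, not benchmark data.

```python
def cost_per_million_tokens(gpu_cost_per_hour_usd: float,
                            tokens_per_second_per_gpu: float) -> float:
    """Dollars spent to generate one million tokens on a single GPU."""
    tokens_per_hour = tokens_per_second_per_gpu * 3600
    return gpu_cost_per_hour_usd / tokens_per_hour * 1_000_000


def tokens_per_second_per_megawatt(tokens_per_second: float,
                                   power_watts: float) -> float:
    """Throughput normalized by power draw, expressed per megawatt."""
    return tokens_per_second / (power_watts / 1_000_000)


# Hypothetical inputs: the same hourly cost at two different throughputs.
baseline = cost_per_million_tokens(gpu_cost_per_hour_usd=3.6,
                                   tokens_per_second_per_gpu=1_000)
faster = cost_per_million_tokens(gpu_cost_per_hour_usd=3.6,
                                 tokens_per_second_per_gpu=4_000)
print(baseline / faster)  # ratio of the two costs per million tokens

# Hypothetical rack: 5,000 tokens/s drawing 500 kW.
print(tokens_per_second_per_megawatt(5_000, 500_000))
```

With the hourly cost held fixed, the cost per token falls in proportion to the throughput gain, which is why throughput and cost figures tend to be reported together.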

InferenceMAX applies the Pareto Frontier approach to balance data center throughput, responsiveness, cost, and energy efficiency, ensuring the best ROI for real-world workloads.
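A Pareto frontier keeps only the configurations that no other configuration beats on every axis at once. The sketch below illustrates the idea on two axes only, per-user responsiveness and per-GPU throughput. The actual benchmark also weighs cost and energy, and the names (`Config`, `pareto_frontier`) and every data point here are hypothetical, not InferenceMAX results.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    name: str
    tps_per_user: float  # responsiveness, higher is better
    tps_per_gpu: float   # throughput, higher is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is at least as good as b on both axes and strictly better on one."""
    at_least_as_good = (a.tps_per_user >= b.tps_per_user
                        and a.tps_per_gpu >= b.tps_per_gpu)
    strictly_better = (a.tps_per_user > b.tps_per_user
                       or a.tps_per_gpu > b.tps_per_gpu)
    return at_least_as_good and strictly_better


def pareto_frontier(configs: List[Config]) -> List[Config]:
    """Keep every configuration that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Hypothetical serving configurations (illustrative values only).
configs = [
    Config("A", 20, 12_000),
    Config("B", 50, 10_000),
    Config("C", 50, 8_000),   # dominated by B
    Config("D", 100, 6_000),
    Config("E", 80, 5_000),   # dominated by D
]
frontier = pareto_frontier(configs)
print([c.name for c in frontier])  # ['A', 'B', 'D']
```

Each surviving point is a different trade-off: A favors raw throughput, D favors responsiveness, and B sits between them. Comparing whole frontiers, and not single points, is what guards against a system that is tuned for only one scenario.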

This demonstrates the strength of Blackwell's full-stack design, which differentiates it from systems optimized for a single scenario.

Blackwell's architecture combines the NVFP4 low-precision format, fifth-generation NVLink, and highly parallel processing algorithms, and its performance continues to improve through collaboration with open-source frameworks such as TensorRT-LLM, Dynamo, SGLang, and vLLM.