
Can FPGA accelerators make ultra-large 256GB/s data processing a reality in the AI era?

Published 2020.03.17 16:53

Optimizing IO through the combination of CPU virtualization, FPGA acceleration, and 256GB/s bandwidth
Intel Minimizes Latency in Gen4 and Gen5 Environments with FPGA-Based Accelerator Cards


"As the server market grows in size, the technology to process large amounts of data quickly and accurately is becoming increasingly important. Beyond developing FPGAs based on our CPU design know-how, we are striving to optimize IO to meet PCI Express standards."

As we enter the era of 5G hyper-connectivity, the demand for data centers that can provide various convergence services continues to increase. Investment in data center servers is gradually increasing to meet network requirements such as cloud, enterprise, and edge resources.

FPGAs have been regarded as a groundbreaking technology whose fast computation can replace graphics cards, but it has only been about one to two years since they began to gain recognition in data center servers. As programmable chips that let users build dedicated accelerators for their target environment, FPGAs have greatly improved performance through parallel processing in specific fields such as financial solutions, machine learning, image processing, and genetic research.

In particular, FPGA-based accelerators are attracting attention in the data market due to the recent strength of the mobile market. In fact, hyperscale data center operators such as Google, Microsoft, and Amazon, as well as domestic companies such as Naver, Kakao, SKT, and KT, are investing heavily in FPGA-based acceleration.

GPUs have limited application areas such as machine learning, and ASICs require large investments, whereas FPGAs can be used in a wide range of applications because logic is programmed directly into the hardware architecture and developers can implement the environment they want.

▲ HPE ProLiant DL380 Gen10 server and Intel FPGA PAC D5005 <Photo = Intel>

Representative companies that provide accelerator cards utilizing FPGA chips are Xilinx and Intel. Xilinx announced the Alveo U250 in 2018 and began mass production in 2019. Intel unveiled the Intel FPGA PAC D5005, an FPGA-based PCIe card-type hardware accelerator, in August of last year.

As future FPGA accelerator applications are expected to see an increase in demand for interfaces that can transmit and receive large amounts of data more quickly and efficiently, Intel will provide an accelerator card that can be used in PCIe Gen5. Interfaces that transmit and receive large amounts of data include Ethernet, transceivers, and PCIe.

Intel's 10nm process FPGA Agilex, unveiled last April, supports the next-generation interface standard CXL (Compute Express Link) and PCIe Gen5. When used alongside CPUs and dedicated accelerators such as GPUs, sufficient bandwidth can be secured, minimizing latency.

▲ Intel Korea Director Nam-Hoon Lee

With the market for accelerators in data centers projected to reach $26 billion by 2022, we met with Intel Korea Director Nam-Hoon Lee to hear about key issues surrounding FPGA virtualization and accelerators.

Q. Since last year, as the use of FPGA accelerators has increased in PCs and IDC servers, the term accelerator has also become more common. Beyond FPGAs taking over the role of existing GPUs, demand for AI accelerators has also been increasing recently. What competitive edge do FPGA accelerators have over graphics cards?

FPGA's competitive edge lies in its fast computational power. It also supports high bandwidth, making it an ideal solution for applications that require low latency.

FPGA accelerators are required to be able to transmit and receive large amounts of data. Accordingly, PCIe is also gradually developing. Intel is providing an accelerator card to increase the speed between FPGA and PCIe.

The Programmable Acceleration Card (PAC), which plugs into the server in card form, uses an FPGA to finish in a matter of seconds tasks that take tens of seconds on a CPU. For example, if a user wants to implement AI, a CPU alone may not deliver the required performance, but an FPGA can easily do so.

However, to deliver the FPGA's high processing speed to the CPU, the bandwidth between them needs to be increased. This is why the latency and bandwidth mentioned above are important.
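The trade-off described here can be sketched as a simple timing model. This is a minimal illustration rather than Intel's method: the function names, the assumption that the same volume of data crosses the link once in each direction, and every number below are hypothetical.

```python
def offload_time_s(data_gb, link_gb_per_s, accel_compute_s):
    """Total time when a task is offloaded: send data, compute, return data.

    Assumes (for illustration only) that the same volume of data crosses
    the link once in each direction and that transfers do not overlap
    with computation.
    """
    transfer_s = data_gb / link_gb_per_s
    return 2 * transfer_s + accel_compute_s


def offload_pays_off(cpu_compute_s, data_gb, link_gb_per_s, accel_compute_s):
    """True when offloading beats running the task on the CPU alone."""
    return offload_time_s(data_gb, link_gb_per_s, accel_compute_s) < cpu_compute_s


# Hypothetical workload: 30 s on the CPU, 3 s on the accelerator, 64 GB of data.
fast_link = offload_time_s(64, 32, 3)  # link moving 32 GB/s per direction
slow_link = offload_time_s(64, 4, 3)   # link moving 4 GB/s per direction
print(fast_link, slow_link)
```

With these made-up numbers the offload takes 7 seconds over the faster link and 35 seconds over the slower one, so only the faster link beats the 30-second CPU run. The accelerator's speed is wasted if the link cannot feed it.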


Q. PCI-SIG announced that it will complete the development of PCIe Gen6 by 2021 and launch it on the market in 2022. Currently, I understand that PCIe Gen4 has a bandwidth of 64 GB/s and a bit rate of 16 GT/s. Can you explain why the bandwidth expansion is necessary?

Let's take smartphones, which have helped drive the adoption of the cloud and AI, as an example. Let's say a user takes a picture on their smartphone and uploads it.

Data sent from smartphones reaches servers via Ethernet. Servers most commonly use the PCIe interface to process this data.

Passing data directly to the CPU through the PCIe interface not only increases memory requirements but also increases processing time. By compressing the data with an FPGA before handing it to the CPU, both capacity and time can be reduced.
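The effect of compressing data before it reaches the CPU can be illustrated in software. The sketch below uses Python's standard `zlib` module purely as a stand-in for the FPGA's compression engine. The function name and the sample payload are hypothetical, and real compression ratios depend heavily on the data.

```python
import zlib


def compress_for_handoff(payload: bytes, level: int = 6) -> bytes:
    """Compress a payload before it is handed to the host CPU and memory.

    zlib here only stands in for whatever compression logic would be
    implemented on the FPGA.
    """
    return zlib.compress(payload, level)


# Hypothetical, highly repetitive payload standing in for uploaded data.
raw = b"pixel-row-0123456789" * 50_000
packed = compress_for_handoff(raw)

print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes")
assert zlib.decompress(packed) == raw  # lossless round trip
```

The smaller the payload that crosses the interface, the less host memory it occupies and the less time the transfer takes.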

Currently, PCIe Gen4 has a bandwidth of 64GB/s, and Gen5 has a bandwidth of 128GB/s. Ethernet in data centers, which carries the incoming data, ran at 10Gb/s in the past, currently runs at 25Gb/s, and is expected to reach 100~400Gb/s in the future.

In order to quickly process 400Gb/s of data and deliver it to users in real time, the bandwidth of PCIe must also be improved simultaneously.
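The bandwidth figures quoted in this interview can be reproduced with simple arithmetic. The sketch below assumes they refer to a x16 link counted in both directions at the raw signalling rate (16, 32, and 64 GT/s per lane for Gen4, Gen5, and Gen6), ignoring encoding and protocol overhead. It also assumes Ethernet line rates are the standard gigabit-per-second figures. The function names are illustrative.

```python
# Raw per-lane signalling rates in GT/s, treated here as Gbit/s per lane.
PCIE_GT_PER_S = {"Gen4": 16, "Gen5": 32, "Gen6": 64}


def pcie_raw_gbytes_per_s(gen, lanes=16, bidirectional=True):
    """Raw PCIe link bandwidth in GB/s, ignoring encoding overhead."""
    per_direction = PCIE_GT_PER_S[gen] * lanes / 8  # bits -> bytes
    return per_direction * (2 if bidirectional else 1)


def ethernet_gbytes_per_s(gbits_per_s):
    """Convert an Ethernet line rate in Gb/s to GB/s."""
    return gbits_per_s / 8


for gen in PCIE_GT_PER_S:
    print(gen, pcie_raw_gbytes_per_s(gen), "GB/s")

for rate in (10, 25, 100, 400):
    print(f"{rate} GbE", ethernet_gbytes_per_s(rate), "GB/s")
```

Under these assumptions the x16 totals come out to 64, 128, and 256 GB/s, while a single 400 Gb/s Ethernet port corresponds to 50 GB/s. Usable throughput is lower than the raw figures once overhead is included.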

Another example is stock trading. Stock quotes must be delivered to consumers in real time, without delay, even at the peak of the trading day.

This is why we need to utilize FPGAs, which have fast computational capabilities and low latency, and expand the bandwidth of PCIe to transmit them.

▲ PCIe bandwidth expansion is required for CPU and FPGA virtualization

Q. FPGAs have always been in the spotlight as chips that replace GPUs. These days, the GPU field is spurring AI development based on virtual environments such as VMware. FPGAs, which perform high-performance digital signal processing, also seem to be indispensable in the virtualization sector.

Of course. Recently, as ICT technologies such as machine learning, deep learning, AI, and cloud have begun to attract the attention of not only enterprises but also consumers, another approach to acceleration has become necessary for data center servers.

Virtualization is a technology used to run multiple applications simultaneously for multiple users, quickly, accurately, and without added latency. The technology that supports this is the FPGA built into the Intel accelerator card.

Related technologies include SR-IOV, Scalable IOV, and VirtIO. PCIe bandwidth is also continuously expanding to increase data transmission and reception speeds between CPUs, FPGAs, and networks. The idea is to optimize design through FPGAs.
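As a small illustration of SR-IOV in practice, Linux reports how many virtual functions (VFs) a PCIe device can expose through the sysfs attributes `sriov_totalvfs` and `sriov_numvfs`. The sketch below reads them for one device directory. The helper name is hypothetical, and the demo runs against a fake directory, so it says nothing about any particular Intel card.

```python
import tempfile
from pathlib import Path


def sriov_status(device_dir):
    """Return SR-IOV VF counts for one PCI device directory, or None.

    Reads the Linux sysfs attributes `sriov_totalvfs` (maximum virtual
    functions the device offers) and `sriov_numvfs` (currently enabled).
    """
    dev = Path(device_dir)
    total_file = dev / "sriov_totalvfs"
    if not total_file.exists():
        return None  # device does not expose SR-IOV
    num_file = dev / "sriov_numvfs"
    enabled = int(num_file.read_text().strip()) if num_file.exists() else 0
    return {"total_vfs": int(total_file.read_text().strip()), "enabled_vfs": enabled}


# Demo against a fake device directory so the sketch runs on any machine.
# On a real Linux host the directory would look like
# /sys/bus/pci/devices/<domain:bus:device.function>.
with tempfile.TemporaryDirectory() as fake_dev:
    (Path(fake_dev) / "sriov_totalvfs").write_text("8\n")
    (Path(fake_dev) / "sriov_numvfs").write_text("2\n")
    demo = sriov_status(fake_dev)
    missing = sriov_status(Path(fake_dev) / "no-such-device")

print(demo, missing)
```

Each enabled VF appears to a virtual machine as its own PCIe device, which is how one physical card can be shared among several guests.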


Q. The PCIe Gen5 and Gen6 era may arrive sooner than we expect. With data transfer speeds increasing exponentially, should product planners choose to migrate from Gen5 to Gen6?

Gen6 has a bandwidth of up to 256GB/s. For developers who need to implement PCIe on servers, we recommend migrating, given how limited server interfaces are.

However, if you are planning a general product rather than a server, I would recommend using the transceivers that Intel currently provides. The transceiver standard is also being upgraded continuously, separately from PCIe. Currently, for Gen4, you can use FPGAs produced on the 14nm process to meet the speed of 64GB/s.

Gen5 speeds will be supported by Intel's Agilex products, which are manufactured on a 10nm process, as will the upcoming 256GB/s Gen6.

As a user protocol, the transceiver also supports different bandwidths per PCIe line, and currently, up to 30 are supported per FPGA line, which will be sufficient to implement the environment desired by the user.

▲ 10nm Agilex <Image = Intel>

Q. FPGA accelerators appear to have potential not only in the server market but also in various fields such as AI, 5G, and autonomous driving. In what areas do you expect high-performance PCIe such as Gen5 to be primarily applied?

I believe that it is useful in most ICT areas of the Fourth Industrial Revolution, including AI servers, edge networks, and 5G relay networks for intelligent transportation systems connected to 5G networks.

The PAC Card provided by Intel is a board-type product with an FPGA installed on the card and has specifications that comply with the PCIe interface.

Once the workload desired by the user is implemented on the FPGA and the PAC card is plugged into the server, it can serve as a compression accelerator, an AI accelerator, or an image conversion accelerator. The role of the accelerator is determined by the function loaded onto the card.

The most important thing is to accurately determine which part the user wants to accelerate. Then, by implementing it on an FPGA and linking it with a server to finalize, you can create the desired environment faster than with existing CPUs.
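Why the choice of what to accelerate matters so much can be shown with Amdahl's law, a standard model that is not cited in the interview itself. If only a fraction of the total runtime is moved to the accelerator, the rest still runs at CPU speed and caps the overall gain. The function name and the sample fractions below are illustrative.

```python
def amdahl_speedup(accelerated_fraction, accel_factor):
    """Overall speedup when only part of a workload is accelerated.

    accelerated_fraction: share of the original runtime that is offloaded (0..1)
    accel_factor: how many times faster that share runs on the accelerator
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if accel_factor <= 0:
        raise ValueError("accel_factor must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / accel_factor)


for fraction in (0.5, 0.9, 0.99):
    speedup = amdahl_speedup(fraction, 10)
    print(f"{fraction:.0%} of runtime accelerated 10x -> {speedup:.2f}x overall")
```

Accelerating half of the runtime tenfold yields less than a 2x overall gain, whereas accelerating 90% of it yields a little over 5x. Profiling to find the dominant part of the workload therefore comes before any FPGA design work.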

Taking self-driving cars as an example, processing the visual information captured by the vehicle can be viewed as a type of AI workload. Vehicles are connected to numerous sensors, and data is transmitted and received through each sensor.

Image judgment and analysis can be processed faster by using FPGAs instead of CPUs.

▲ Companies are using accelerators in their data centers to respond to the hyper-connected era.


Intel acquired FPGA manufacturer Altera for $16 billion (KRW 19.36 trillion) in 2015, and then in April of last year, acquired OmniTek, an FPGA company specializing in high-performance computer vision and AI inference technology, in an effort to expand its market share.

With the explosion of AI and big data markets and the decline of Moore’s Law, the semiconductor market is now at a critical inflection point. As the traditional silicon design cycle can no longer keep up with the pace of innovation, technologies that accelerate applications using hardware and software are gaining attention.

The accelerator market has entered a full-blown second round of competition, this time between GPUs, which survived the earlier contest between GPUs and coprocessors, and accelerators based on FPGAs and ASICs.
Reporter In-Young Choi

Next-Generation PCIe, a Prerequisite for Ultra-High-Speed AI: What You Want to Know
2020-04-28 10:30~12:05
Intel / Director Nam-Hoon Lee