

# ACCELERATING DATACENTER WITH FPGA

### Agenda

- Market Trends : Data Explosion, Software Defined X
- What is an FPGA? Why FPGAs for the Data Center?
- Use Examples
- Acceleration Stack for Intel<sup>®</sup> Xeon<sup>®</sup> CPU with FPGAs



# Market Trends : Data Explosion

#### **BY 2020**





All numbers are approximate.

Sources:

- http://www.cisco.com/c/en/us/solutions/service-provider/vni-network-traffic-forecast/infographic.html
- http://www.cisco.com/c/en/us/solutions/collateral/service-provider/global-cloud-index-gci/Cloud\_Index\_White\_Paper.html
- https://datafloq.com/read/self-driving-cars-create-2-petabytes-data-annually/172

# Market Trends : Data Explosion



#### **Market Examples**

- Financial
- Genomics
- Government
- Enterprise
- Cloud

#### Infrastructure

- Network
- Storage
- Compute



- Security
- Transcode
- Video processing and analytics
- Artificial Intelligence
- Packet processing






<sup>‡</sup>Intel Estimate





# Market Trends : Software Defined X

#### **Dedicated Infrastructures**

Physical Appliances Network/Dedicated Servers



#### **Flexible Cloud Infrastructure**

Commercial Off the Shelf (COTS) Servers



#### Uniform, Scalable, Programmable (**Reconfigurable**)

### **FPGA** Overview

- Field Programmable Gate Array (FPGA)
  - Millions of logic elements
  - Thousands of embedded memory blocks
  - Thousands of DSP blocks
  - Programmable routing
  - High speed transceivers
  - Various built-in hardened IP
- Used to create Custom Hardware!



## Advantages of Custom Hardware with FPGAs





### **FPGA Custom Hardware**

Custom Datapath on the FPGA Matches Your Algorithm!

- Typically creates a very deeply pipelined version of a kernel
  - Huge number of operations simultaneously in flight
- Data can more easily be localized on chip
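To make the pipelining point concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes an idealized pipeline that accepts one new item per clock cycle and never stalls. The function names and the example numbers are illustrative assumptions, not figures from this presentation.

```python
def sequential_cycles(num_items: int, depth: int) -> int:
    """Cycles needed if each item must finish all `depth` stages
    before the next item is allowed to start."""
    return num_items * depth


def pipelined_cycles(num_items: int, depth: int) -> int:
    """Cycles needed by an ideal pipeline: `depth` cycles until the
    first result appears, then one further result every cycle."""
    if num_items == 0:
        return 0
    return depth + num_items - 1


def ops_in_flight(num_items: int, depth: int) -> int:
    """Largest number of items occupying pipeline stages at once."""
    return min(num_items, depth)


if __name__ == "__main__":
    items, depth = 1000, 100  # hypothetical workload and pipeline depth
    print("sequential:", sequential_cycles(items, depth), "cycles")
    print("pipelined: ", pipelined_cycles(items, depth), "cycles")
    print("in flight: ", ops_in_flight(items, depth), "operations")
```

Under these assumptions, per-item latency stays at `depth` cycles while throughput approaches one result per cycle, which is why a deeper pipeline keeps more operations in flight without slowing the stream.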



#### Build exactly what you need: Operations, Data widths, Memory size & configuration

#### Efficiency: Throughput / Latency / Power

# **FPGA Enabled Performance and Agility**

Workload Optimization: ensure Xeon cores serve their highest value processing

Efficient Performance: improve performance/watt


**Real-Time:** high bandwidth connectivity and low-latency parallel processing

**Developer Advantage:** code re-use across Intel FPGA data center products



The Intel® Xeon® processor with FPGA acceleration can reduce TCO and solve new problems

### Acceleration types in a Data Center

- Application Acceleration : Part of the application domain
  - Artificial Intelligence, Video Transcoding, HPC, ...
- Infrastructure Acceleration : Part of the data center infrastructure
  - Virtual Switching, Software defined Networking, compression, cryptography, packet processing, ...


### Use Examples

- Microsoft
- GATK
- **SQL** Acceleration
- Public Cloud Service
  - AWS F1 Instance
  - Alibaba
  - NIMBIX





### Why FPGAs As Accelerators?



#### **FPGAs Maximize ROI**

- One Architecture <u>efficiently</u> implements many workloads
- Application flexibility
- Reconfiguration in µs
- Power efficient



#### Modern Data Center Facts

- 3-5 year Life Cycle
- High CAPEX and OPEX
- Requirement to support rapid scale-out
- Flexibility to adapt to rapidly changing workloads

#### **Alternative Accelerators**

|                   | GPU    | Network Processor | FPGA   |
|-------------------|--------|-------------------|--------|
| Throughput        | ✓ High | ✓ High            | ✓ High |
| Latency           | High   | ✓ Low             | ✓ Low  |
| Power             | High   | ✓ Low             | ✓ Low  |
| General computing | ✓ Yes  | No                | ✓ Yes  |

FPGA provides optimal networking & compute combination



# Why Intel<sup>®</sup> FPGAs in the Data Center?



FPGA Acceleration Enabled with the Extended Intel® Architecture

#### Acceleration Stack for Intel® Xeon® CPU with FPGAs



#### Intel® delivers a system-optimized solution stack for your data center workloads

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos. Some names pending final approval. Logos and names provided for illustrative purposes only. Current availability may be different.

# Intel® Xeon® with FPGA Virtualization Framework



Simplifies the use of FPGAs in virtualized cloud environments



### Interfacing with the Software Stack



#### **FPGA** Components






### **Acceleration Environment**

Common Developer Interface for Intel® FPGA Data Center Products



<sup>1</sup>OPAE = Open Programmable Acceleration Engine

<sup>2</sup>UPI = Intel<sup>®</sup> Ultra Path Interconnect

<sup>3</sup>HSSI = High Speed Serial Interface



# **Open Programmable Acceleration Engine (OPAE)**

#### Consistent API across product generations and platforms

Abstraction for hardware-specific FPGA resource details

#### Designed for minimal software overhead and latency

Lightweight user-space library (libfpga)

#### Open ecosystem for industry and developer community

License: FPGA API (BSD), FPGA driver (GPLv2)

#### FPGA driver being upstreamed into Linux kernel

Supports both virtual machines and bare metal platforms

Faster development and debugging of Accelerator Functions with the included AFU Simulation Environment (ASE)

Includes guides, command-line utilities and sample code

Simplified FPGA Programming Model for Application Developers



Start developing for Intel FPGAs with OPAE today: http://01.org/OPAE



### **FPGA Driver Architecture**



#### FME: FPGA Management Engine Driver

- Static circuits for power/thermal management, reconfiguration, debugging, error reporting, performance counters, etc.

#### AFU: Accelerator Function Unit Driver

- Reconfigurable circuits for application specific functions.
- Exposes a 256KB region as control registers.
- User process can share memory buffers with AFU.

#### Port:

- Interface between the static and the reconfigurable regions
- Each port can attach an AFU. There may be multiple ports.
- A port can be assigned to a VM and expose the AFU.
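The relationship between the three pieces can be sketched as a toy data model in Python. This is only an illustration of the roles described above. It is not the FPGA driver or the OPAE API, and every class, method, and AFU name here is a hypothetical stand-in. The only value taken from the slide is the 256KB control-register region.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

AFU_MMIO_BYTES = 256 * 1024  # the 256KB control-register region


def _check_offset(offset: int) -> None:
    """Reject register offsets outside the AFU's 256KB region."""
    if not 0 <= offset < AFU_MMIO_BYTES:
        raise ValueError("offset outside the 256KB register region")


@dataclass
class Afu:
    """Reconfigurable, application-specific function (toy stand-in)."""
    name: str
    registers: Dict[int, int] = field(default_factory=dict)

    def write_csr(self, offset: int, value: int) -> None:
        _check_offset(offset)
        self.registers[offset] = value

    def read_csr(self, offset: int) -> int:
        _check_offset(offset)
        return self.registers.get(offset, 0)  # unwritten registers read as 0


@dataclass
class Port:
    """Boundary between the static region and one reconfigurable slot."""
    afu: Optional[Afu] = None
    assigned_vm: Optional[str] = None


@dataclass
class Fme:
    """Static management logic: owns the ports and reconfigures them."""
    ports: List[Port]

    def reconfigure(self, port_index: int, afu: Afu) -> None:
        self.ports[port_index].afu = afu

    def assign_port_to_vm(self, port_index: int, vm_name: str) -> None:
        self.ports[port_index].assigned_vm = vm_name


if __name__ == "__main__":
    fme = Fme(ports=[Port(), Port()])               # a device with two ports
    fme.reconfigure(0, Afu(name="example_afu"))     # hypothetical AFU name
    fme.assign_port_to_vm(0, "vm-a")                # hypothetical VM name
    fme.ports[0].afu.write_csr(0x40, 0x1234)
    print(hex(fme.ports[0].afu.read_csr(0x40)))
```

The model keeps management (reconfiguration, VM assignment) on the `Fme` side and register access on the `Afu` side, mirroring the split between the static and reconfigurable regions.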



### OPAE FPGA API – Enumerate, Manage & Access




### How can FPGA accelerators be **created**?

#### Self-Developed

#### **Externally-Sourced**



### Two Development Approaches

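#### **HDL Programming**

[Figure: HDL sources pass through synthesis and place-and-route in Intel® Quartus® Prime to produce the AFU bitstream, while the C host code is built with a software compiler into the application executable. The application runs on the CPU on top of OPAE software; the AFU runs on the FPGA alongside the FIM. The AFU Simulation Environment (ASE) from Intel is available for simulation.]

#### **OpenCL\* Programming**

[Figure: The Intel® FPGA SDK for OpenCL compiles OpenCL kernels into the AFU bitstream, while the OpenCL host code is built with a software compiler into the application executable. The application runs on the CPU on top of OPAE software; the AFU runs on the FPGA alongside the FIM and the OpenCL BSP. An OpenCL emulator is available.]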

# OpenCL<sup>™</sup> Flow

- Usage no different from traditional OpenCL<sup>™</sup> flow
  - C based development and optimization flow to create AFUs and Host Application
  - Standard OpenCL FPGA application using the Intel® FPGA SDK for OpenCL
    - FPGA OpenCL debug and profiling tools supported
  - More information on using <u>OpenCL with FPGAs</u>
- The Acceleration Stack is abstracted away from the user
  - OPAE part of the Host Run-Time
    - Host does not need to interact with OPAE SW directly
  - OpenCL BSP part of the FPGA Interface Manager

To learn more about using OpenCL with FPGAs, visit the Intel FPGA Customer Training page.



# **RTL AFU**



- Develop RTL AFU with standard FPGA development tools
- Interface with the acceleration stack through the Core Cache Interface (CCI-P)
  - Provides a base platform memory interface
    - Simple request/response interface (Simple Read/Write)
    - Physical addresses
    - No order guarantees
  - These minimal requirements satisfy major classes of algorithms, e.g.:
    - Double buffered kernels that read from and write to different buffers
    - Streaming kernels that read from one memory-mapped FIFO and write to another
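The double-buffered case can be illustrated with a small Python sketch. It is a software analogy, not CCI-P code: the neighbour-sum kernel, the function names, and the shuffled access order are assumptions chosen to show why a kernel that only reads one buffer and only writes another does not need ordering guarantees from the memory interface.

```python
import random
from typing import List


def step(src: List[int], dst: List[int], order: List[int]) -> None:
    """One kernel pass: read only from `src`, write only to `dst`.

    `order` is the sequence in which elements are processed. Because no
    element of `src` is modified during the pass, any order gives the
    same `dst`.
    """
    n = len(src)
    for i in order:
        dst[i] = src[(i - 1) % n] + src[i] + src[(i + 1) % n]


def run_passes(data: List[int], passes: int, shuffle: bool, seed: int = 0) -> List[int]:
    """Run `passes` kernel passes, swapping buffer roles after each one."""
    rng = random.Random(seed)
    src = list(data)
    dst = [0] * len(data)
    for _ in range(passes):
        order = list(range(len(src)))
        if shuffle:
            rng.shuffle(order)  # mimic accesses completing in arbitrary order
        step(src, dst, order)
        src, dst = dst, src     # the output buffer becomes the next input
    return src


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    in_order = run_passes(data, passes=3, shuffle=False)
    out_of_order = run_passes(data, passes=3, shuffle=True)
    print(in_order == out_of_order)
```

Updating a single buffer in place would make the result depend on processing order, which is exactly the dependency the two-buffer structure avoids.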






- AFU Simulation Environment (ASE) enables seamless portability to real HW
  - Allows fast verification of OPAE software together with AFU RTL without HW
    - SW Application loads ASE library and connects to RTL simulation
  - For execution on HW, application loads Runtime library and RTL is compiled by Intel<sup>®</sup> Quartus into FPGA bitstream



# Intel<sup>®</sup> Programmable Acceleration Card with Intel Arria<sup>®</sup> 10 GX FPGA

#### Intel's 1st versatile FPGA PCIe acceleration card that offers inline & look-aside acceleration for workloads requiring up to 45W



1<sup>st</sup> acceleration card to offer the Acceleration Stack for Intel Xeon CPU with FPGAs, enabling broader FPGA adoption in the data center




# Summary

- Acceleration Stack for Intel<sup>®</sup> Xeon<sup>®</sup> CPU with FPGAs
  - Robust collection of software, firmware, and tools
  - Makes it easy to develop and deploy Intel FPGAs in the data center
  - Supports both RTL and OpenCL<sup>™</sup> development flows
  - Intel FPGA Acceleration Hub
- Follow-on trainings
  - RTL Development and Acceleration with the Acceleration Stack for Intel<sup>®</sup> Xeon<sup>®</sup> CPU and FPGAs
  - Intel FPGA OpenCL Trainings
- References
  - Various quick start and development guides associated with the Acceleration Stack








