Nvidia BlueField

Nvidia BlueField is a line of data processing units (DPUs) designed and produced by Nvidia. BlueField was initially developed by Mellanox Technologies; Nvidia obtained the BlueField IP through its US$6.9 billion acquisition of Mellanox, which was announced in March 2019 and completed in April 2020.[1] The first Nvidia-produced BlueField cards, named BlueField-2, were shipped for review shortly after their announcement at VMworld 2019 and were officially launched at GTC 2020.[2] Also launched at GTC 2020 was the Nvidia BlueField-2X, a BlueField card with an Ampere-generation graphics processing unit (GPU) integrated onto the same card.[2] BlueField-3 and BlueField-4 DPUs were first announced at GTC 2021, with tentative launch dates of 2022 and 2024 respectively.[3]

Nvidia BlueField cards are intended for use in data centers and high-performance computing, where latency and bandwidth are important for efficient computation.[4]

BlueField cards differ from network interface controllers in two ways: they offload functions that would normally be reserved for the CPU, and they carry their own CPU cores (typically ARM or MIPS based) and memory (typically DDR4, though BlueField-3 added support for other memory types such as HBM and DDR5). BlueField cards also run an operating system completely independent of the host system; this is designed to reduce software overhead, as each DPU can function independently of the other DPUs and of the host.[5] This independence also means that BlueField cards can provide remote management for systems that may not otherwise support it. BlueField cards can also configure their PCIe interface to function as a host rather than a device, which lets a BlueField card connect over a PCIe bridge to another card, such as a compute accelerator, and provide fully network-based, high-bandwidth control of a GPU.[6]

The BlueField-X cards are DPU-GPU hybrid cards with a 100-class Nvidia data center GPU integrated on the same PCB as the BlueField DPU. These cards are intended for high-power GPU clusters, where they allow high-bandwidth communication without crossing the host PCIe bus and placing unnecessary load on the CPU, whose capacity may be better spent on other types of processing. The increase in total external connectivity available to a system in this configuration allows datasets to be spread across multiple nodes when they are too large for any single system to hold in memory.
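The multi-node case can be made concrete with a little arithmetic. The following Python sketch is illustrative only: the function name `nodes_needed`, the 1,000 GB dataset, and the 80 GB of usable memory per node are assumptions chosen for the example, not figures from Nvidia documentation.

```python
import math


def nodes_needed(dataset_gb: float, usable_memory_per_node_gb: float) -> int:
    """Return the smallest number of nodes whose combined memory holds the dataset.

    Both arguments are in the same unit (GB here). This ignores replication,
    working-set overhead, and any memory reserved by the system.
    """
    if dataset_gb < 0:
        raise ValueError("dataset size must be non-negative")
    if usable_memory_per_node_gb <= 0:
        raise ValueError("per-node memory must be positive")
    return math.ceil(dataset_gb / usable_memory_per_node_gb)


# Hypothetical example: a 1,000 GB dataset on nodes with 80 GB usable memory each.
print(nodes_needed(1000, 80))  # 13
```

A real deployment would also have to account for how the data is partitioned and how often nodes exchange it, which is where the added network connectivity matters.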

Models

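| Model | Announcement date | Release date | Networking port options | Bandwidth capacity | Cores | Core type | PCIe generation | Memory capacity | Memory type | GPU accelerator | SPECint (2k17-rate)[7] | TOPS[7] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BlueField-2 | October 5, 2020 | Q2 2021 | Dual QSFP56 10/25/50/100 Gb; single QSFP56 200 Gb | 200 Gbit/s | 8 | ARM A72 | 4.0 | 16/32 GB | DDR4 | N/A | 9 | 0.7 |
| BlueField-2X |  | Q4 2021 |  |  |  |  |  |  |  | Nvidia A100 |  | 60 |
| BlueField-3 | April 12, 2021 | Q1 2022 | Quad/Dual/Single QSFP56 | 400 Gbit/s | 16 | ARM A78 | 5.0 | 64 GB | DDR5 | N/A | 42 | 1.5 |
| BlueField-3X |  | N/A |  |  |  |  |  |  |  | Nvidia A100 |  | 75 |
| BlueField-4 |  | 2024 | OSFP112 | 800 Gbit/s | TBD | TBD | TBD | TBD | TBD |  | 160 | 400 |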
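As a rough guide to what the bandwidth figures in the table mean in practice, the Python sketch below converts a link rate in Gbit/s to GB/s and estimates the minimum time to move a payload at full line rate. The function names and the 100 GB payload are hypothetical, decimal units are assumed (1 GB = 8 Gbit), and protocol overhead is ignored, so real transfers will be slower.

```python
# Link rates in Gbit/s, taken from the bandwidth column of the table above.
LINK_RATE_GBIT_PER_S = {
    "BlueField-2": 200,
    "BlueField-3": 400,
    "BlueField-4": 800,
}


def line_rate_gbytes_per_s(gbit_per_s: float) -> float:
    """Convert a link rate from Gbit/s to GB/s (8 bits per byte)."""
    return gbit_per_s / 8


def min_transfer_seconds(payload_gb: float, gbit_per_s: float) -> float:
    """Lower bound on the time to move payload_gb at full line rate."""
    if gbit_per_s <= 0:
        raise ValueError("link rate must be positive")
    return payload_gb * 8 / gbit_per_s


# Hypothetical 100 GB payload on each generation's link.
for model, rate in LINK_RATE_GBIT_PER_S.items():
    seconds = min_transfer_seconds(100, rate)
    print(f"{model}: {line_rate_gbytes_per_s(rate):.0f} GB/s, {seconds:.1f} s for 100 GB")
```

Each doubling of link rate halves this lower bound; the estimate says nothing about latency, which the article also identifies as important.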

H100 CNX & A100 EGX

The H100 CNX and the A100 EGX are NIC/GPU hybrid cards. Although visually similar to a BlueField-X card, they are a distinct product and do not integrate the BlueField system on a chip. They are instead equipped with a generic ConnectX network interface controller.[8][9]

References