Nvidia BlueField

Nvidia BlueField is a line of data processing units (DPUs) designed and produced by Nvidia. BlueField was initially developed by Mellanox Technologies and passed to Nvidia through its acquisition of Mellanox, which was announced in March 2019 for US$6.9 billion and completed in April 2020.^[1] The first Nvidia-produced BlueField cards, named BlueField-2, were shipped for review shortly after their announcement at VMworld 2019 and were officially launched at GTC 2020.^[2] Also launched at GTC 2020 was the Nvidia BlueField-2X, a BlueField card with an Ampere-generation graphics processing unit (GPU) integrated onto the same card.^[2] BlueField-3 and BlueField-4 DPUs were first announced at GTC 2021, with tentative launch dates of 2022 and 2024, respectively.^[3]

Nvidia BlueField cards are intended for datacenters and high-performance computing, where latency and bandwidth are important for efficient computation.^[4]

BlueField cards differ from network interface controllers in two ways: they offload functions that would normally be reserved for the host CPU, and they carry their own CPU cores (typically ARM or MIPS based) and memory (typically DDR4, though BlueField-3 added support for more exotic memory types such as HBM and DDR5). BlueField cards also run an operating system completely independent of the host system. This is designed to reduce software overhead, as each DPU can function independently of the other DPUs and of the host.^[5] It also means that BlueField cards can provide remote management for systems that may not otherwise support it. BlueField cards can additionally configure their PCIe bus to function as a host rather than a device, which lets a BlueField card connect over a PCIe bridge to another card, such as a compute accelerator, and provide completely network-based, high-bandwidth control of a GPU.^[6]

The BlueField-X cards are DPU-GPU hybrid cards with a 100-class Nvidia datacenter GPU integrated on the same PCB as the BlueField DPU. These cards are intended for high-power GPU clusters, where they allow high-bandwidth communication without crossing the host PCIe bus and without placing an unnecessary load on the CPU, whose performance may be better allocated to other types of processing. The increase in total external connectivity available to a system in this configuration allows datasets that are too large for any single system to hold in memory to be used across multiple nodes.
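
The multi-node point above can be illustrated with some back-of-the-envelope arithmetic. The following is a minimal sketch, not a model of any real deployment: the dataset size and per-node memory are hypothetical, the function names are invented for illustration, and the transfer time is an ideal figure that ignores protocol overhead. The link rates of 200 and 400 Gbit/s are the BlueField-2 and BlueField-3 bandwidth figures from the Models table.

```python
import math


def nodes_needed(dataset_gb: float, memory_per_node_gb: float) -> int:
    """Smallest number of nodes whose combined memory can hold the dataset."""
    return math.ceil(dataset_gb / memory_per_node_gb)


def transfer_seconds(size_gb: float, link_gbit_per_s: float) -> float:
    """Ideal time to move size_gb gigabytes over one link (no protocol overhead)."""
    return size_gb * 8 / link_gbit_per_s


# Hypothetical workload: a 2,000 GB dataset on nodes with 512 GB of memory each.
dataset_gb = 2000
memory_per_node_gb = 512

n = nodes_needed(dataset_gb, memory_per_node_gb)
shard_gb = dataset_gb / n  # equal-sized shard held by each node

print(f"nodes needed: {n}, shard size: {shard_gb:.0f} GB")
for link in (200, 400):  # Gbit/s, per the Models table
    print(f"{link} Gbit/s link: {transfer_seconds(shard_gb, link):.1f} s per shard")
```

Under these assumptions the dataset spans four nodes, and doubling the link rate halves the ideal time needed to move one shard between nodes.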

Models


Model	Announcement Date	Release Date	Networking Port Options	Bandwidth Capacity	Cores	Core Type	PCIe Generation	Memory Capacity	Memory Type	GPU Accelerator	SPECint (2k17-rate)^[7]	TOPS^[7]
BlueField-2	October 5, 2020	Q2 2021	Dual QSFP56 (10/25/50/100 Gbit/s) or single QSFP56 (200 Gbit/s)	200 Gbit/s	8	ARM A72	4.0	16/32 GB	DDR4	N/A	9	0.7
BlueField-2X	October 5, 2020	Q4 2021	Dual QSFP56 (10/25/50/100 Gbit/s) or single QSFP56 (200 Gbit/s)	200 Gbit/s	8	ARM A72	4.0	16/32 GB	DDR4	Nvidia A100	9	60
BlueField-3	April 12, 2021	Q1 2022	Quad/dual/single QSFP56	400 Gbit/s	16	ARM A78	5.0	64 GB	DDR5	N/A	42	1.5
BlueField-3X	April 12, 2021	N/A	Quad/dual/single QSFP56	400 Gbit/s	16	ARM A78	5.0	64 GB	DDR5	Nvidia A100	42	75
BlueField-4	April 12, 2021	2024	OSFP112	800 Gbit/s	TBD					TBD	160	400
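
The generational differences in the table can be made concrete with a few ratios. The sketch below copies the BlueField-2 and BlueField-3 figures from the table; the `Dpu` record and the helper functions are hypothetical names introduced only for this illustration, and the per-core figure is a simple division of the listed SPECint rate by the listed core count, not a published benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dpu:
    """One row of the Models table (only the numeric columns used here)."""
    name: str
    bandwidth_gbit_s: int
    cores: int
    specint_rate: int


# Figures copied from the table above.
BLUEFIELD_2 = Dpu("BlueField-2", bandwidth_gbit_s=200, cores=8, specint_rate=9)
BLUEFIELD_3 = Dpu("BlueField-3", bandwidth_gbit_s=400, cores=16, specint_rate=42)


def ratio(newer: Dpu, older: Dpu, field: str) -> float:
    """How many times larger a given column is on the newer model."""
    return getattr(newer, field) / getattr(older, field)


def specint_per_core(dpu: Dpu) -> float:
    """Listed SPECint rate divided by the listed core count."""
    return dpu.specint_rate / dpu.cores


for field in ("bandwidth_gbit_s", "cores", "specint_rate"):
    print(f"{field}: x{ratio(BLUEFIELD_3, BLUEFIELD_2, field):.2f}")
for dpu in (BLUEFIELD_2, BLUEFIELD_3):
    print(f"{dpu.name}: {specint_per_core(dpu):.3f} SPECint rate per core")
```

On the listed figures, bandwidth and core count both double from BlueField-2 to BlueField-3 while the SPECint rate grows by more than that, which implies a higher listed rate per core as well as more cores.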

H100 CNX & A100 EGX

The H100 CNX and the A100 EGX are NIC/GPU hybrid cards. Although visually similar to a BlueField-X card, they are completely distinct products and do not integrate the BlueField system on a chip. They are instead equipped with a generic ConnectX network interface controller.^[8]^[9]

References

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]