NVIDIA Fabric Manager, NVSwitch, CUDA, and NVLink

Quick Overview and Comparison

Note that this is only a quick overview and comparison, so it may be imprecise in places. For more details, please refer to the following sections.

What is NVIDIA Fabric Manager?

Fabric Manager is a daemon that runs in the background and manages NVSwitch and NVLink. It is an optional part of the NVIDIA driver stack: installing fabricmanager is only useful when your hardware has NVSwitch.
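Since Fabric Manager runs as a background daemon, a quick way to see whether it is up is to ask systemd. The following is a minimal sketch, assuming a systemd-based Linux host where the service is installed under the unit name `nvidia-fabricmanager`; the helper name `fabric_manager_state` and its `run` parameter are made up for this illustration.

```python
import subprocess


def fabric_manager_state(unit="nvidia-fabricmanager", run=subprocess.run):
    """Return systemd's state string for the unit (e.g. "active", "inactive").

    Returns "unknown" when systemctl cannot be executed or prints nothing.
    The unit name is an assumption; adjust it if your install differs.
    """
    try:
        result = run(
            ["systemctl", "is-active", unit],
            capture_output=True,
            text=True,
            check=False,  # a non-zero exit just means "not active"
        )
    except OSError:
        # systemctl is missing or not executable on this host.
        return "unknown"
    return result.stdout.strip() or "unknown"


if __name__ == "__main__":
    print("Fabric Manager state:", fabric_manager_state())
```

On a machine without NVSwitch you would normally expect the unit to be absent or inactive, which is consistent with the note above that fabricmanager is only useful on NVSwitch hardware.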

Quick Overview of NVSwitch and NVLink

[1] Note that DGX-1 has no NVSwitch, but it already supports NVIDIA GPUDirect Remote Direct Memory Access (RDMA). In other words, RDMA is not built on top of NVSwitch. See the white paper "NVIDIA DGX-1 With Tesla V100 System Architecture: The Fastest Platform for Deep Learning".

Difference between NVSwitch and NVLink

A rough overall picture:

- NVSwitch is a hardware switch for connecting GPUs. It is a hardware component and is managed by Fabric Manager.
- NVLink is a hardware interconnect for connecting GPUs. It is a hardware component and is driven by CUDA.

NVSwitch is a switch architecture designed to further enhance the capabilities of NVLink.
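To see the NVLink side of this picture on a real machine, `nvidia-smi topo -m` prints a GPU-to-GPU matrix in which a cell such as `NV2` means the two GPUs are connected through a bonded set of 2 NVLinks. The sketch below parses such a matrix. The function name `nvlink_pairs` and the `SAMPLE` text are illustrative assumptions (the sample is hand-written, not captured from real hardware), and the parser assumes the header line is indented while each GPU row starts with its `GPU<n>` label.

```python
import re

NVLINK_CELL = re.compile(r"^NV(\d+)$")
GPU_ROW = re.compile(r"^GPU\d+\s")

# Hand-written, illustrative matrix in the general shape of `nvidia-smi topo -m`.
SAMPLE = """\
        GPU0    GPU1    GPU2    CPU Affinity
GPU0     X      NV2     SYS     0-15
GPU1    NV2      X      PIX     0-15
GPU2    SYS     PIX      X      16-31
"""


def nvlink_pairs(topo_text):
    """Map (gpu_a, gpu_b) to the NVLink count shown for that pair."""
    rows = [line.split() for line in topo_text.splitlines() if GPU_ROW.match(line)]
    names = [row[0] for row in rows]
    pairs = {}
    for i, row in enumerate(rows):
        # The first len(names) cells after the label form the GPU-to-GPU matrix.
        cells = row[1:1 + len(names)]
        for j, cell in enumerate(cells):
            match = NVLINK_CELL.match(cell)
            # The matrix is symmetric, so keep only the upper triangle.
            if match and j > i:
                pairs[(names[i], names[j])] = int(match.group(1))
    return pairs


if __name__ == "__main__":
    for (a, b), links in nvlink_pairs(SAMPLE).items():
        print(f"{a} <-> {b}: {links} NVLink(s)")
```

To use it on live output, pass the text returned by running `nvidia-smi topo -m` instead of `SAMPLE`. The matrix only shows that a pair of GPUs is connected over NVLink; it does not say whether those links go through an NVSwitch.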