Each host carries a Host Channel Adapter (HCA), an intelligent NIC that implements the InfiniBand transport stack in hardware. The application uses the verbs API (for example ibv_post_send) to post an RDMA WRITE, RDMA READ, SEND, or atomic operation; for RDMA reads and writes the HCA accesses remote memory directly, with zero copies and no CPU involvement on the remote side. The switched fabric relies on a Subnet Manager to compute paths (installed in the switches as linear forwarding tables) and on credit-based flow control: a sender transmits only when the receiver has buffer credit available, so packets are never dropped for lack of buffer space. Physical layer: ports aggregate lanes (1×/4×/8×/12×) and use QSFP (up to HDR) or OSFP (NDR and beyond) connectors, with copper up to 10 m and fiber up to 10 km.
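The credit rule described above can be illustrated with a toy model. This is a minimal sketch, not how an HCA or switch implements it: the class CreditedLink, the function run, and the tick-based drain are hypothetical names and simplifications chosen for illustration, and real InfiniBand tracks credits per virtual lane in units of buffer blocks rather than whole packets.

```python
from collections import deque


class CreditedLink:
    """Toy model of one direction of a link with credit-based flow control."""

    def __init__(self, rx_buffer_slots):
        self.rx_buffer = deque()
        self.rx_buffer_slots = rx_buffer_slots
        # The sender starts with one credit per free receive-buffer slot.
        self.credits = rx_buffer_slots
        self.dropped = 0  # stays 0 as long as the sender honors credits

    def try_send(self, packet):
        """Transmit only if a credit is available; otherwise hold the packet."""
        if self.credits == 0:
            return False
        self.credits -= 1
        if len(self.rx_buffer) >= self.rx_buffer_slots:
            self.dropped += 1  # not reachable while credits are tracked correctly
        else:
            self.rx_buffer.append(packet)
        return True

    def receiver_drain(self, n):
        """Receiver consumes up to n packets and returns one credit for each."""
        count = min(n, len(self.rx_buffer))
        drained = [self.rx_buffer.popleft() for _ in range(count)]
        self.credits += count
        return drained


def run(total_packets, slots, drain_per_tick):
    """Push packets through the link; return (delivered, dropped, stalled_ticks)."""
    if drain_per_tick < 1:
        raise ValueError("receiver must drain at least one packet per tick")
    link = CreditedLink(slots)
    pending = deque(range(total_packets))
    delivered = []
    stalled_ticks = 0
    while pending or link.rx_buffer:
        # Sender: transmit while credits last, then wait.
        while pending and link.try_send(pending[0]):
            pending.popleft()
        if pending:
            stalled_ticks += 1  # back-pressure: sender waits instead of dropping
        # Receiver: free buffer space, which hands credits back to the sender.
        delivered.extend(link.receiver_drain(drain_per_tick))
    return delivered, link.dropped, stalled_ticks


if __name__ == "__main__":
    delivered, dropped, stalled = run(total_packets=10, slots=4, drain_per_tick=2)
    print(delivered, dropped, stalled)
```

In this model a slow receiver shows up as stalled sender ticks rather than lost packets, which is the trade-off credit-based flow control makes: congestion turns into back-pressure instead of drops.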
Traditional Ethernet-with-TCP/IP introduced high latency, CPU overhead, and lossy behavior that disqualified it as an HPC/AI interconnect. InfiniBand solves this with native RDMA, lossless link-level flow control, and a switched-fabric topology from layer 1 up.
Host-side adapter that implements the IB transport stack in hardware and serves the RDMA verbs (send, receive, write, read, atomic).
Fabric switch that forwards IB packets between HCAs based on the linear forwarding table installed by the Subnet Manager.
Control-plane component (typically run on one of the nodes or in a switch) that discovers topology, assigns LIDs, and programs routing tables in switches.
Official
IBTA-standardized set of programming operations (ibv_post_send, ibv_open_device, ibv_reg_mr…) implemented by the libibverbs library (OFED).
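How the Subnet Manager and the switches divide the work can be sketched in a few lines. The topology, node names, LID values, and the functions build_lfts and forward below are all hypothetical and exist only for illustration; a real Subnet Manager such as OpenSM uses far more elaborate routing engines, and a real linear forwarding table is an array indexed by destination LID rather than a dictionary.

```python
from collections import deque

# Hypothetical two-switch fabric: (switch, port) -> node attached to that port.
LINKS = {
    ("sw1", 1): "hca_a",
    ("sw1", 2): "sw2",
    ("sw2", 1): "sw1",
    ("sw2", 2): "hca_b",
}

# LIDs the Subnet Manager has assigned to the end nodes (made-up values).
LIDS = {"hca_a": 1, "hca_b": 2}


def build_lfts(links, lids):
    """Control plane: for every switch, map each destination LID to an output port.

    Uses a breadth-first search outward from each destination, so every switch
    forwards along a shortest path toward that destination.
    """
    switches = {switch for switch, _port in links}
    lfts = {switch: {} for switch in switches}
    for end_node, lid in lids.items():
        visited = {end_node}
        queue = deque([end_node])
        while queue:
            node = queue.popleft()
            for (switch, port), neighbor in links.items():
                if neighbor == node and switch not in visited:
                    visited.add(switch)
                    lfts[switch][lid] = port  # this port leads toward the LID
                    queue.append(switch)
    return lfts


def forward(lfts, links, ingress_switch, dlid):
    """Data plane: each switch only looks up the destination LID in its own table."""
    path = [ingress_switch]
    node = ingress_switch
    while node in lfts:  # stop once the packet reaches a non-switch node
        out_port = lfts[node][dlid]
        node = links[(node, out_port)]
        path.append(node)
    return path


if __name__ == "__main__":
    tables = build_lfts(LINKS, LIDS)
    print(forward(tables, LINKS, "sw1", LIDS["hca_b"]))
```

The split shown here mirrors the component descriptions above: the tables are computed once by a central control plane, and the per-packet work in a switch reduces to a single table lookup.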
After NVIDIA's acquisition of Mellanox (announced 2019, completed 2020) and Intel's exit (Omni-Path), NVIDIA is effectively the sole IB hardware vendor.
If the master SM fails, no new paths can be set up until another SM takes over, so a standby SM must be configured.
IB is a dedicated fabric: IPoIB or a gateway is required to interoperate with the broader IP infrastructure.
IB hardware (HCAs, switches, cabling) is typically more expensive than equivalent Ethernet at the same line rate.
NGIO (Intel) and Future I/O (Compaq, IBM, HP) merge into the InfiniBand Trade Association.
First release of the IB architecture specification.
Mellanox ships the first commercial InfiniBand products at 10 Gbit/s line rate (SDR).
OpenIB Alliance (later OpenFabrics) integrates the IB stack into the mainline Linux kernel.
After years of HPC ecosystem growth, InfiniBand becomes the dominant interconnect on the TOP500 list.
The acquisition makes IB a strategic component of NVIDIA's AI platform: the Quantum (switches) and ConnectX (HCA) lines.
Introduction of NDR (Quantum-2, ConnectX-7), the scale-out fabric of frontier-class AI clusters.
NVIDIA announces Quantum-X800 and ConnectX-8 as the next-gen fabric for Blackwell GPUs.
A switched fabric with multi-rail HCAs and adaptive routing enables parallel communication between thousands of GPUs without a single-link bottleneck.
Per-lane bandwidth generation: from 2.5 Gbit/s (SDR) up to 200 Gbit/s (XDR).
Number of aggregated physical lanes per port. 4× is the standard; 12× is used switch-to-switch.
Fat tree, dragonfly, torus: affects bisection bandwidth, cost, and diameter.
IB packet size: typically 256 B to 4 KB (max).
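The parameters above combine by simple arithmetic: port rate is the per-lane rate times the link width, and a message is split into packets no larger than the configured packet size (MTU). The sketch below is illustrative only. The function names are hypothetical, the SDR and XDR rates come from the text, and the intermediate generations use rounded nominal per-lane rates added here as an assumption; encoding and protocol overhead are ignored.

```python
# Rounded nominal per-lane rates in Gbit/s. SDR and XDR are taken from the text;
# the other entries are rounded nominal figures added for illustration.
LANE_RATE_GBPS = {
    "SDR": 2.5,
    "DDR": 5,
    "QDR": 10,
    "FDR": 14,
    "EDR": 25,
    "HDR": 50,
    "NDR": 100,
    "XDR": 200,
}

VALID_WIDTHS = (1, 4, 8, 12)
VALID_MTUS = (256, 512, 1024, 2048, 4096)


def port_rate_gbps(generation, width=4):
    """Nominal port rate = per-lane rate x number of aggregated lanes."""
    if width not in VALID_WIDTHS:
        raise ValueError(f"unsupported link width: {width}x")
    return LANE_RATE_GBPS[generation] * width


def packets_for_message(message_bytes, mtu_bytes=4096):
    """Number of packets needed to carry a message payload at a given MTU."""
    if mtu_bytes not in VALID_MTUS:
        raise ValueError(f"unsupported MTU: {mtu_bytes}")
    if message_bytes <= 0:
        raise ValueError("message_bytes must be positive")
    return (message_bytes + mtu_bytes - 1) // mtu_bytes  # ceiling division


if __name__ == "__main__":
    print(port_rate_gbps("SDR", 4))              # 4x SDR, the 10 Gbit/s line rate
    print(port_rate_gbps("NDR", 4))              # 4x NDR
    print(packets_for_message(1024 * 1024))      # 1 MiB at a 4 KB MTU
    print(packets_for_message(1024 * 1024, 256)) # same message at a 256 B MTU
```

A larger MTU means fewer packets per message, and so less per-packet header overhead, which is why bulk-transfer workloads generally favor the 4 KB setting.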
IB is the primary scale-out fabric of the NVIDIA DGX/SuperPOD platforms for H100/H200/B200 GPU clusters.