H3C DDC-based RoCE Switch Network vs. InfiniBand Network

64-GPU AI Computing Performance Comparison Test: H3C DDC-based RoCE Switch Network vs. InfiniBand Network

Sponsor: New H3C Technologies Co., Ltd

Document Number: 224166

Publication Date: 12/13/2024

Page Count: 6

Abstract

Distributed Disaggregated Chassis (DDC) is a network architecture that breaks away from the traditional centralized modular switch design. It takes a distributed, decoupled approach to make data center networks more flexible and scalable. DDC uses hardware technologies such as Virtual Output Queue (VOQ) and cell switching to improve link utilization and throughput between NCP and NCF. These capabilities meet the stringent low-latency, low-packet-loss requirements that High-Performance Computing (HPC) and AI applications place on the transport network.

 

The Tolly test evaluated the performance of the NVIDIA Collective Communication Library (NCCL) and of a large language model (Llama3) across different network architectures in a 64-GPU environment. Specifically, the test compared the performance of RDMA over Converged Ethernet (RoCE) and InfiniBand (IB) networks. Within the RoCE network, Tolly engineers also tested H3C's DDC technology against traditional Equal-Cost Multi-Path (ECMP) technology.

 

The test results for NCCL and Llama3 demonstrate that DDC-based RoCE delivers performance and user experience comparable to IB in the same operational scenarios. 

 

The NCCL Alltoall test results show that DDC offers significant advantages in bus bandwidth (busbw) compared to traditional ECMP hash methods.
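For readers unfamiliar with the busbw metric, the sketch below shows how bus bandwidth is commonly derived from a measured Alltoall run. It assumes the convention used by the open-source nccl-tests benchmarks, where algorithm bandwidth is data size divided by elapsed time and the Alltoall bus bandwidth scales that value by (n - 1) / n for n ranks. The function name and the sample numbers are hypothetical and are not taken from the report.

```python
def alltoall_busbw_gbps(size_bytes, elapsed_s, num_ranks):
    """Return (algbw, busbw) in GB/s for an Alltoall run.

    Assumes the nccl-tests convention: algbw = size / time, and
    busbw = algbw * (n - 1) / n for Alltoall.
    """
    if elapsed_s <= 0 or num_ranks < 1:
        raise ValueError("elapsed_s must be positive and num_ranks >= 1")
    algbw = size_bytes / elapsed_s / 1e9
    busbw = algbw * (num_ranks - 1) / num_ranks
    return algbw, busbw


if __name__ == "__main__":
    # Illustrative inputs only: 1 GB exchanged in 50 ms across 64 ranks.
    algbw, busbw = alltoall_busbw_gbps(1e9, 0.05, 64)
    print(f"algbw = {algbw:.2f} GB/s, busbw = {busbw:.2f} GB/s")
```

Because the correction factor depends only on the rank count, busbw values from two fabrics tested at the same GPU count can be compared directly.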
