Maciej Besta and Marcel Schneider and Marek Konieczny and Karolina Cynk and Erik Henriksson and Salvatore Di Girolamo and Ankit Singla and Torsten Hoefler:
FatPaths: Routing in Supercomputers and Data Centers when Shortest Paths Fall Short
(In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC20), Nov. 2020)
Abstract
We introduce FatPaths: a simple, generic, and robust routing architecture that enables state-of-the-art low-diameter topologies such as Slim Fly to achieve unprecedented performance. FatPaths targets Ethernet stacks in both HPC supercomputers and cloud data centers and clusters. FatPaths exposes and exploits the rich (“fat”) diversity of both minimal and non-minimal paths for high-performance multi-pathing. Moreover, FatPaths uses a redesigned “purified” transport layer that removes virtually all TCP performance issues (e.g., the slow start), and incorporates flowlet switching, a technique used to prevent packet reordering in TCP networks, to enable very simple and effective load balancing. Our design enables recent low-diameter topologies to outperform powerful Clos designs, achieving 15% higher net throughput at 2× lower latency for comparable cost. FatPaths will significantly accelerate Ethernet clusters that form more than 50% of the Top500 list, and it may become a standard routing scheme for modern topologies.
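The abstract mentions flowlet switching only in passing, so a small illustration of the general idea may help. The sketch below is not taken from the paper and is not FatPaths' implementation: the class name `FlowletSwitch`, the `gap_threshold` parameter, and the random path choice are all assumptions made for illustration. The idea shown is that packets of one flow keep their path while they arrive in a tight burst (a flowlet), and a new path may be picked only after an idle gap long enough that earlier packets are unlikely to be overtaken.

```python
import random


class FlowletSwitch:
    """Toy flowlet-based path selector (hypothetical, for illustration only)."""

    def __init__(self, num_paths, gap_threshold, seed=0):
        self.num_paths = num_paths          # number of candidate paths
        self.gap_threshold = gap_threshold  # idle time that ends a flowlet
        self.rng = random.Random(seed)      # seeded for repeatable runs
        self.table = {}                     # flow_id -> (last_seen, path)
        self.flowlets_started = 0           # counts new flowlets, for inspection

    def route(self, flow_id, now):
        """Return the path index for a packet of flow_id arriving at time now."""
        entry = self.table.get(flow_id)
        if entry is None or now - entry[0] > self.gap_threshold:
            # First packet of the flow, or the idle gap was long enough:
            # start a new flowlet and pick a (possibly different) path.
            path = self.rng.randrange(self.num_paths)
            self.flowlets_started += 1
        else:
            # Still inside the current flowlet: keep the path so that
            # packets of this burst cannot overtake each other.
            path = entry[1]
        self.table[flow_id] = (now, path)
        return path


switch = FlowletSwitch(num_paths=4, gap_threshold=0.5)
burst = [switch.route("flow-a", t) for t in (0.0, 0.1, 0.2)]  # one flowlet
later = switch.route("flow-a", 1.0)  # gap of 0.8 > 0.5 starts a new flowlet
```

In this sketch the threshold plays the role of an upper bound on the latency difference between paths; choosing it in a real network is a design decision the sketch does not address.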
BibTeX
@inproceedings{besta-fp,
  author={Maciej Besta and Marcel Schneider and Marek Konieczny and Karolina Cynk and Erik Henriksson and Salvatore Di Girolamo and Ankit Singla and Torsten Hoefler},
  title={{FatPaths: Routing in Supercomputers and Data Centers when Shortest Paths Fall Short}},
  year={2020},
  month={Nov.},
  booktitle={Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC20)},
  source={http://www.unixer.de/~htor/publications/},
}