Interconnect

Aurora

Slingshot

Cray Slingshot is the next-generation fabric technology from Cray/HPE, with advanced features such as sophisticated congestion management and network Quality of Service (QoS). Aurora will use a Slingshot fabric connected in a Dragonfly topology, with 8 fabric endpoints per node.

Slingshot uses a 64-port switch (called Rosetta) that provides 12.8 Tb/s of bandwidth per direction, from 64 ports at 200 Gb/s each. The Dragonfly topology enables a system of exascale capacity with a diameter of just three network hops. The advanced congestion management keeps latency low even under load by avoiding congestion and queueing in the network. Slingshot's low network diameter also makes adaptive routing more responsive: each switch can potentially have a good view of the overall state of the network, so it can make fast, well-informed decisions about which paths to take to avoid temporary congestion. With the combination of adaptive routing and congestion control, Slingshot can provide highly effective performance isolation between workloads. Early evaluation of Slingshot's congestion management under adversarial traffic patterns with the GPCNeT benchmarks was encouraging, demonstrating Slingshot's ability to optimize network traffic [1].
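The switch bandwidth figure and the three-hop diameter quoted above can be checked with a small sketch. The code below is illustrative only: it models a generic textbook dragonfly (all-to-all links inside each group, one global link between every pair of groups), not Aurora's actual wiring. The function names and the toy parameters `a` (switches per group) and `h` (global links per switch) are assumptions made for this example, and hops are counted switch to switch.

```python
from collections import deque
from itertools import combinations


def switch_bandwidth_tbps(ports, port_gbps):
    """Aggregate per-direction switch bandwidth in Tb/s."""
    return ports * port_gbps / 1000


def build_dragonfly(a, h):
    """Adjacency sets for a toy dragonfly (hypothetical parameters).

    a: switches per group, h: global links per switch,
    giving g = a*h + 1 groups with one global link per group pair.
    """
    g = a * h + 1
    adj = {(grp, sw): set() for grp in range(g) for sw in range(a)}

    # Local links: every switch connects to every other switch in its group.
    for grp in range(g):
        for s, t in combinations(range(a), 2):
            adj[(grp, s)].add((grp, t))
            adj[(grp, t)].add((grp, s))

    # Global links: group i uses global port d-1 to reach group i+d,
    # and port p belongs to switch p // h.
    for i, j in combinations(range(g), 2):
        d = j - i
        u = (i, (d - 1) // h)
        v = (j, (g - d - 1) // h)
        adj[u].add(v)
        adj[v].add(u)
    return adj


def diameter(adj):
    """Longest shortest path (in switch-to-switch hops), via BFS from each switch."""
    worst = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    # 64 ports at 200 Gb/s each, as stated for the Rosetta switch.
    print(switch_bandwidth_tbps(64, 200))    # 12.8
    # Toy dragonfly: 4 switches per group, 2 global links each, 9 groups.
    print(diameter(build_dragonfly(4, 2)))   # 3
```

In this toy model the worst case is one local hop to the switch that owns the right global link, one global hop, and one local hop in the destination group, which is where the diameter of three comes from.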

[1] GPCNeT: designing a benchmark suite for inducing and measuring contention in HPC networks (https://dl.acm.org/doi/10.1145/3295500.3356215)

Some relevant references on Slingshot:

https://www.cray.com/sites/default/files/Slingshot-The-Interconnect-for-the-Exascale-Era.pdf

https://www.cray.com/blog/meet-slingshot-an-innovative-interconnect-for-the-next-generation-of-supercomputers/