192 EPYC “Trento” 64-core CPUs, 1536 Instinct MI250X GPUs, 40 PFLOPs of horsepower

Oak Ridge National Laboratory has released an overview of its Crusher system, which features AMD’s optimized 3rd generation EPYC CPUs and Instinct MI250X GPUs.

Overview of ORNL’s all-AMD-powered Crusher system: optimized 3rd generation EPYC CPUs and Instinct MI250X GPUs

The Crusher system is a test platform for ORNL’s upcoming Frontier supercomputer. It is equipped with the latest AMD EPYC “Trento” CPUs and Instinct MI250X “Aldebaran” GPUs. The node count is small, but the system still packs a lot of punch given the number of CPU and GPU cores in it.


Crusher is a moderate-security system at the National Center for Computational Sciences (NCCS) that contains the same hardware and software as the upcoming Frontier system. It will be used as an early-access testbed by the Center for Accelerated Application Readiness (CAAR) and Exascale Computing Project (ECP) teams, as well as by NCCS staff and vendor partners.


According to the overview published by ORNL, the Crusher test system consists of two cabinets. One has 128 compute nodes and the other has 64, for a total of 192 compute nodes in the complete configuration. Each node has a single 64-core AMD EPYC 7A53 CPU based on the optimized 3rd generation EPYC architecture. Frontier will be equipped with the same AMD “Trento” CPU, an optimized version of the Milan chip. It keeps Milan’s 64 cores and 128 threads but offers optimized clocks and power efficiency. Each CPU has access to 512 GB of DDR4 memory.

On the GPU side, each node has four AMD Instinct MI250X GPUs. Each MI250X packs two GCDs, and because a node treats each GCD as a separate GPU, every Crusher node sees a total of eight GPUs. Each MI250X provides up to 52 TFLOPs of peak FP64 performance, 220 compute units (110 per GCD), and 128 GB of HBM2e memory (64 GB per GCD), with up to 3.2 TB/s of memory bandwidth per MI250X accelerator. The GCDs are interconnected via Infinity Fabric links that provide 200 GB/s of bidirectional bandwidth.
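The per-node figures above can be rolled up into system-wide totals with some quick arithmetic. The sketch below is only a back-of-the-envelope check using the numbers quoted in this article; the variable names are illustrative, and the 52 TFLOPs figure is the rounded “up to” peak, so the result is an upper bound rather than a measured number.

```python
# Figures quoted in the article; names are illustrative, not from ORNL.
NODES = 128 + 64                    # two cabinets
MI250X_PER_NODE = 4
GCDS_PER_MI250X = 2                 # each GCD is presented as a separate GPU
PEAK_FP64_TFLOPS_PER_MI250X = 52    # rounded "up to" peak figure
HBM_GB_PER_MI250X = 128

total_mi250x = NODES * MI250X_PER_NODE
total_gcds = total_mi250x * GCDS_PER_MI250X
peak_fp64_pflops = total_mi250x * PEAK_FP64_TFLOPS_PER_MI250X / 1000
total_hbm_gb = total_mi250x * HBM_GB_PER_MI250X

print(f"compute nodes:        {NODES}")
print(f"MI250X accelerators:  {total_mi250x}")
print(f"GCDs (visible GPUs):  {total_gcds}")
print(f"peak FP64 (PFLOPs):   {peak_fp64_pflops:.1f}")
print(f"total HBM2e (GB):     {total_hbm_gb}")
```

Under these assumptions, the 1536 in the headline matches the number of GCDs (the GPUs each node actually sees), while the roughly 40 PFLOPs figure follows from the 768 MI250X accelerators at 52 TFLOPs each.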

When it comes to interconnects, the AMD EPYC CPU is connected to the GPUs over Infinity Fabric with a peak bandwidth of 36+36 GB/s. Crusher nodes are connected via four HPE Slingshot 200 Gbit/s NICs (25 GB/s each), which provide a node injection bandwidth of 800 Gbit/s (100 GB/s).
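The NIC figures mix bits and bytes, so the conversion is worth spelling out. This is a minimal sketch using only the numbers quoted above; the constant names are illustrative.

```python
# Unit conversion for the Slingshot figures quoted in the article.
NICS_PER_NODE = 4
NIC_GBIT_PER_S = 200
BITS_PER_BYTE = 8

per_nic_gbyte_per_s = NIC_GBIT_PER_S / BITS_PER_BYTE      # GB/s per NIC
node_gbit_per_s = NICS_PER_NODE * NIC_GBIT_PER_S          # Gbit/s per node
node_gbyte_per_s = node_gbit_per_s / BITS_PER_BYTE        # GB/s per node

print(f"per NIC:  {NIC_GBIT_PER_S} Gbit/s = {per_nic_gbyte_per_s:.0f} GB/s")
print(f"per node: {node_gbit_per_s} Gbit/s = {node_gbyte_per_s:.0f} GB/s")
```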

There are [4x] NUMA domains per node and [2x] L3 cache regions per NUMA domain, for a total of [8x] L3 cache regions. Each of the eight GPUs is associated with one of the L3 regions as follows:


  • Hardware threads 000-007, 064-071 | GPU 4
  • Hardware threads 008-015, 072-079 | GPU 5

  • Hardware threads 016-023, 080-087 | GPU 2
  • Hardware threads 024-031, 088-095 | GPU 3

  • Hardware threads 032-039, 096-103 | GPU 6
  • Hardware threads 040-047, 104-111 | GPU 7

  • Hardware threads 048-055, 112-119 | GPU 0
  • Hardware threads 056-063, 120-127 | GPU 1
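The mapping above follows a regular pattern: each L3 region covers a block of eight hardware threads, and thread N is listed alongside thread N+64. A small lookup such as the one below can help when deciding which GPU a process pinned to a given hardware thread should use. This is a sketch transcribed from the list, not an ORNL-provided tool; the function and table names are hypothetical.

```python
# L3 region index (0-7, in hardware-thread order) -> associated GPU,
# transcribed from the list above. Names here are hypothetical.
L3_REGION_TO_GPU = {0: 4, 1: 5, 2: 2, 3: 3, 4: 6, 5: 7, 6: 0, 7: 1}

THREADS_PER_NODE = 128
THREAD_PAIR_OFFSET = 64   # threads N and N+64 are listed in the same region
THREADS_PER_REGION = 8


def gpu_for_hardware_thread(thread: int) -> int:
    """Return the GPU associated with the L3 region of a hardware thread."""
    if not 0 <= thread < THREADS_PER_NODE:
        raise ValueError(f"hardware thread must be in 0..127, got {thread}")
    region = (thread % THREAD_PAIR_OFFSET) // THREADS_PER_REGION
    return L3_REGION_TO_GPU[region]


for t in (0, 71, 16, 104, 48, 127):
    print(f"hardware thread {t:03d} -> GPU {gpu_for_hardware_thread(t)}")
```

Note that the GPU numbering does not follow the thread order (threads 000-007 map to GPU 4, not GPU 0), which is why an explicit table is used rather than a formula.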

The following block diagram of a single Crusher node shows the interconnect bandwidth between the AMD EPYC CPU and the Instinct MI250X GPU accelerators.


In addition, the Crusher system hosts 250 PB of storage with a peak write speed of 2.5 TB/s and provides access to a center-wide NFS-based file system. Expect to see more from AMD’s EPYC CPU and Instinct GPU platforms as they go to work in the Frontier supercomputer this year.

News source: Coelacanth’s Dream
