Intel and Micron today announced that the new version of Intel’s Xeon Phi, a highly parallel coprocessor for research applications, will be built using a custom version of Micron’s Hybrid Memory Cube, or HMC.
This is only the second announced application for this new memory product – the first was a Fujitsu supercomputer back in November.
For those who, like me, were unfamiliar with the Xeon Phi, it’s a module that uses high core-count processors for problems that can be solved with high degrees of parallelism. My friend and processor guru Nathan Brookwood tells me that it uses a SIMD (Single-Instruction, Multiple-Data) architecture, in which the same instruction is executed across tens of data sets simultaneously. This works well for problems that can be expressed in linear algebra (matrices), which include graphics, weather, big data analytics, earthquakes, genome sequencing, combustion, nuclear reactions, and aerodynamics. In fact, graphics processing units (GPUs) from nVidia and AMD use this approach, and these companies have beaten out Intel in the past to win supercomputer designs, a situation that Intel wants to rectify once and for all. Intel and Micron tell me that, with the help of the HMC, the newly-released version of the Xeon Phi (formerly known as Knights Landing) will have five times the bandwidth of a DDR4-based alternative while running at about one-fifth the power of the GDDR5-based Knights Corner system introduced in 2012.
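To make the SIMD idea a little more concrete, here is a minimal conceptual sketch in Python. It is not how the Xeon Phi is actually programmed: plain lists stand in for hardware vector registers, and the lane count and the function name `simd_multiply_add` are my own illustrative assumptions.

```python
# Conceptual sketch only: Python lists stand in for hardware vector registers.
LANES = 16  # hypothetical lane count, chosen purely for illustration


def simd_multiply_add(a, b, c):
    """Apply the one operation a*b + c to every lane of the operands."""
    if not (len(a) == len(b) == len(c)):
        raise ValueError("all operands must have the same number of lanes")
    # One "instruction" (multiply-add), many data elements.
    return [x * y + z for x, y, z in zip(a, b, c)]


a = [float(i) for i in range(LANES)]  # 0.0, 1.0, 2.0, ...
b = [2.0] * LANES
c = [1.0] * LANES

result = simd_multiply_add(a, b, c)
print(result[:4])  # first four lanes
```

In real SIMD hardware the loop disappears: all lanes are processed by a single instruction, which is why matrix-style workloads map onto it so well.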
A processor like this needs a lot of memory bandwidth to keep it fed, and this is where “The Memory Wall” can do its most damage. The Memory Wall is a term used to describe the fact that memory generally cannot deliver data fast enough to keep a processor running at its highest speed, and thus limits its performance. Knights Landing attacks this problem by using the HMC’s very high bandwidth, and by running eight HMCs in parallel on the module, each with its own processor-memory bus.
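One way to picture the Memory Wall is a simplified roofline-style bound, which is my own illustration and not something Intel or Micron presented: a processor can go no faster than the smaller of its peak compute rate and what memory bandwidth can feed it. Every number below is a hypothetical placeholder, not a Knights Landing specification; only the 5x bandwidth ratio comes from the claim quoted above.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Simplified roofline bound: limited by compute or by memory traffic."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


# Hypothetical placeholder values, for illustration only.
PEAK_GFLOPS = 3000.0         # assumed peak compute rate
BASE_BANDWIDTH_GB_S = 100.0  # assumed bandwidth of a conventional memory system
FLOPS_PER_BYTE = 2.0         # assumed arithmetic intensity of the workload

slow = attainable_gflops(PEAK_GFLOPS, BASE_BANDWIDTH_GB_S, FLOPS_PER_BYTE)
# Same processor and workload with five times the memory bandwidth.
fast = attainable_gflops(PEAK_GFLOPS, 5 * BASE_BANDWIDTH_GB_S, FLOPS_PER_BYTE)

print(f"memory-bound at base bandwidth: {slow:.0f} GFLOP/s")
print(f"with 5x bandwidth:              {fast:.0f} GFLOP/s")
```

With these made-up numbers the processor stays memory-bound in both cases, so every extra unit of bandwidth translates directly into delivered performance.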
As a refresher, an HMC is a stack of DRAM chips all interconnected through thousands of through-silicon vias (TSVs) to an underlying interface chip built using a logic process. Logic is far more efficient at driving I/O signals than a DRAM process is, and it runs faster. Micron tells us that one HMC provides 15 times the bandwidth of a DDR3 module using 70% less energy while fitting into a smaller footprint.
Why is this important? Intel asserts that there’s a waterfall from today’s supercomputers to the single-socket processor of the future. According to Intel, today’s leading edge technology will become a single-socket solution in about 15-17 years. Intel is using this new processor to work out the architecture of systems in 2030.
From Micron’s perspective this will provide a big boost to HMC consumption. Intel has announced the first Knights Landing design win, the Cori system designed by the National Energy Research Scientific Computing Center (NERSC). NERSC, Intel, and Cray formed a partnership to design this new supercomputer to accelerate “extreme-scale” scientific discovery. The system was named “Cori” in honor of biochemist Gerty Cori, the first female US citizen to receive a Nobel Prize in science. Cori will use over 9,300 Knights Landing nodes, each of which (to my understanding) will sport eight HMCs, with each HMC built using four DRAM chips and one logic chip. That totals about 300,000 DRAM chips, not including spares, which is pretty good for a single system.
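For anyone who wants to check the chip count, the arithmetic can be sketched as follows. The node count is a lower bound ("over 9,300"), and the eight-HMCs-per-node figure is my understanding rather than a confirmed specification.

```python
NODES = 9_300            # "over 9,300" nodes, so treat this as a lower bound
HMCS_PER_NODE = 8        # per my understanding; not a confirmed figure
DRAM_CHIPS_PER_HMC = 4
LOGIC_CHIPS_PER_HMC = 1

hmcs = NODES * HMCS_PER_NODE
dram_chips = hmcs * DRAM_CHIPS_PER_HMC
logic_chips = hmcs * LOGIC_CHIPS_PER_HMC

print(f"HMCs:        {hmcs:,}")         # 74,400
print(f"DRAM chips:  {dram_chips:,}")   # 297,600
print(f"Logic chips: {logic_chips:,}")  # 74,400
```

That works out to 297,600 DRAM chips, which rounds to the 300,000 quoted above, plus 74,400 logic chips.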
Intel calls Cori “the first of many” Knights Landing supercomputers, implying that we will hear several more announcements based on this new rendition of Xeon Phi. I am sure that Micron hopes this will come true.