Highlights - June 2016
Sunway TaihuLight, a system developed by China’s National Research Center of Parallel Computer Engineering & Technology (NRCPC) and installed at the National Supercomputing Center in Wuxi, in China's Jiangsu province, is the No. 1 system with 93 petaflop/s (Pflop/s) on the Linpack benchmark. The system has 40,960 nodes, each with one SW26010 processor, for a combined total of 10,649,600 computing cores. Each SW26010 processor is composed of 4 management processing elements (MPEs), 4 clusters of 64 computing processing elements (CPEs) for a total of 260 cores, 4 Memory Controllers (MC), and a Network on Chip (NoC) connected to the System Interface (SI). Each of the four MPEs, CPE clusters, and MCs has access to 8GB of DDR3 memory. The system is based on processors exclusively designed and built in China. The Sunway TaihuLight is almost three times as fast and three times as efficient as Tianhe-2, the system it displaces in the number one spot. The peak power consumption under load (running the HPL benchmark) is 15.371 MW, which corresponds to about 6 Gflops/W. This allows the TaihuLight system to hold one of the top spots on the Green500 in terms of the Performance/Power metric.
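The core count and efficiency figures quoted above can be cross-checked with a few lines of arithmetic. The following is only a sketch: the constants are the rounded values from the text, and the function names are illustrative, not part of any TOP500 tooling.

```python
# Figures as quoted in the text (rounded values, so results are approximate).
NODES = 40_960
CORES_PER_PROCESSOR = 260   # one SW26010 processor per node
LINPACK_PFLOPS = 93.0       # Linpack (HPL) result in Pflop/s
POWER_MW = 15.371           # peak power under HPL load in MW


def total_cores(nodes: int, cores_per_node: int) -> int:
    """Total computing cores across all nodes."""
    return nodes * cores_per_node


def gflops_per_watt(pflops: float, megawatts: float) -> float:
    """Energy efficiency: 1 Pflop/s = 1e6 Gflop/s and 1 MW = 1e6 W."""
    return (pflops * 1e6) / (megawatts * 1e6)


if __name__ == "__main__":
    print(total_cores(NODES, CORES_PER_PROCESSOR))
    print(round(gflops_per_watt(LINPACK_PFLOPS, POWER_MW), 2))
```

With these inputs the core count matches the 10,649,600 stated above, and the efficiency comes out slightly above 6 Gflops/W.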
Highlights from the Top 10
Sunway TaihuLight is the only new system in the Top 10, and its appearance pushes each of the other systems down one position.
- Tianhe-2 (Milky Way-2), a system developed by China’s National University of Defense Technology (NUDT) and deployed at the National Supercomputer Center in Guangzhou, China, is now the No. 2 system with 33.86 petaflop/s (Pflop/s) on the Linpack benchmark. Tianhe-2 was the No. 1 system in the TOP500 list for the past 3 years (6 lists).
- Titan, a Cray XK7 system installed at the Department of Energy’s (DOE) Oak Ridge National Laboratory, is now the No. 3 system. It achieved 17.59 Pflop/s on the Linpack benchmark using 261,632 of its NVIDIA K20x accelerator cores.
- Sequoia, an IBM BlueGene/Q system installed at DOE’s Lawrence Livermore National Laboratory, is now the No. 4 system. It was first delivered in 2011 and has achieved 17.17 Pflop/s on the Linpack benchmark using 1,572,864 cores.
- Fujitsu’s K computer installed at the RIKEN Advanced Institute for Computational Science (AICS) in Kobe, Japan, is the No. 5 system with 10.51 Pflop/s on the Linpack benchmark using 705,024 SPARC64 processing cores.
- Mira, a BlueGene/Q system installed at DOE’s Argonne National Laboratory, is No. 6 with 8.59 Pflop/s on the Linpack benchmark using 786,432 cores.
- Trinity, a Cray XC40 system installed at DOE/NNSA/LANL/SNL, which joined the TOP 10 last year, is now No. 7 with 8.1 Pflop/s and 301,056 cores.
- At No. 8 is Piz Daint, a Cray XC30 system installed at the Swiss National Supercomputing Centre (CSCS) in Lugano, Switzerland and the most powerful system in Europe. Piz Daint achieved 6.27 Pflop/s on the Linpack benchmark using 73,808 NVIDIA K20x accelerator cores.
- Hazel Hen, a Cray XC40 system installed at HLRS in Stuttgart is at No. 9 with 5.64 Pflop/s using 185,088 cores.
- Shaheen II, a Cray XC40 system installed at King Abdullah University of Science and Technology (KAUST) in Saudi Arabia, is at No. 10 with 5.536 Pflop/s on the Linpack benchmark using 196,608 Intel Xeon E5-2698v3 cores.
Highlights from the Overall List
- The number of systems installed in China has increased dramatically to 167, compared to 109 on the last list. China is now at the No. 1 position as a user of HPC. Additionally, China now holds the No. 1 position in performance share, thanks to the big contribution of the systems at No. 1 and No. 2.
- The number of systems installed in the USA has declined sharply and is now at 165, down from 199 in the previous list. This is the lowest number of systems installed in the U.S. since the list was started 23 years ago.
- The overall list-by-list growth rates of performance continue to recover after historically low values in the past 3 years.
- The performance of the last system on the list (No. 500) has systematically lagged behind historical trends for the last 6 years and now clearly runs on a different growth trajectory than before. From 1994 to 2008 it grew by 90 percent per year. Since 2008 it has only grown by 55 percent per year.
- The growth of the average performance of all systems in the list has slowed since 2013 as well and has also dropped to about 55 percent per year.
- There are 95 systems with performance greater than a Pflop/s on the list, up from 81 six months ago.
- In the Top 10, the No. 2 system, Tianhe-2, uses Intel Xeon Phi processors to speed up its computational rate. The No. 3 system, Titan, and the No. 8 system, Piz Daint, use NVIDIA GPUs to accelerate computation.
- A total of 93 systems on the list are using accelerator/co-processor technology, down from 104 in November 2015. Sixty-seven (67) of these use NVIDIA chips, 26 use Intel Xeon Phi technology, three use ATI Radeon, and two use PEZY technology. Three systems use a combination of NVIDIA and Intel Xeon Phi accelerators/co-processors.
- The average number of accelerator cores for these systems is 76,000 cores/system.
- Intel continues to provide the processors for the largest share (91 percent) of TOP500 systems.
- 98.2 percent of the systems use processors with six or more cores, 84.8 percent use eight or more cores, and 63.6 percent use ten or more cores.
- The Cray XC series is now the most popular system type in the TOP 10, with four entries at No. 7, 8, 9 and 10. A Cray XK7 system remains at No. 3, making Cray the dominant vendor in the TOP 10 with 5 systems, the same number as six months ago.
General highlights from the TOP500 since the November 2015 edition
- The entry level to the list moved up to the 285.9 Tflop/s mark on the Linpack benchmark, compared to 206.3 Tflop/s six months ago.
- The last system on the newest list would have been listed at position 351 in the previous TOP500. This represents a slight recovery when compared to the last list.
- Total combined performance of all 500 systems has grown to 566.7 Pflop/s, compared to 420 Pflop/s six months ago and 363 Pflop/s one year ago. This increase in installed performance also exhibits a noticeable slowdown in growth compared to the previous long-term trend.
- The entry point for the TOP100 increased in six months to 958 Tflop/s, up from 816 Tflop/s.
- The average concurrency level in the TOP500 is 81,995 cores per system, up from 58,596 six months ago and 50,495 one year ago.
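The growth claims in this list follow from simple percentage changes between editions. As a rough sketch (the helper name `pct_growth` is hypothetical, and the inputs are the rounded figures quoted above), the six-month and one-year changes can be computed like this:

```python
def pct_growth(old: float, new: float) -> float:
    """Percentage change from an earlier value to a later one."""
    return (new - old) / old * 100.0


# (earlier, current) values as quoted in the list above.
figures = {
    "entry level, Tflop/s (6 months)": (206.3, 285.9),
    "total performance, Pflop/s (6 months)": (420.0, 566.7),
    "total performance, Pflop/s (1 year)": (363.0, 566.7),
    "TOP100 entry point, Tflop/s (6 months)": (816.0, 958.0),
    "average cores per system (6 months)": (58_596, 81_995),
}

if __name__ == "__main__":
    for label, (old, new) in figures.items():
        print(f"{label}: {pct_growth(old, new):+.1f}%")
```

Because the inputs are rounded, the percentages are approximate. They make it easier to compare, for example, how the entry level and the combined performance of the list moved over the same six months.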
Vendor Trends
- A total of 455 systems (91 percent) are now using Intel processors, slightly up from 89 percent six months ago.
- The share of IBM Power processors is now at 23 systems, down from 26 systems six months ago.
- The AMD Opteron family is used in 13 systems (2.6 percent), down from 4.2 percent on the previous list.
- InfiniBand technology is now found on 205 systems, down from 235 systems, and is now the second most-used internal system interconnect technology. Gigabit Ethernet has risen to 218 systems, up from 182 systems, in large part thanks to 176 systems now using 10G interfaces.
- Following its acquisition of IBM’s x86 business 2 years ago, Lenovo now has 84 systems in the list, up from 25 systems six months ago. Some systems that were previously listed as IBM are now labeled as both IBM/Lenovo (4 systems) and Lenovo/IBM (4 systems).
- HPE has the lead in systems and now has 127 systems (25.4 percent), down from 155 systems six months ago, followed by Lenovo with 84 systems. Cray now has 60 systems, down from 69 systems six months ago. IBM is now 5th in the systems category with 38 systems.
Performance Trends
- Cray continues to be the clear performance leader in the TOP500 list, with a 19.9 percent share of total installed performance (down from 25 percent).
- Thanks to the Sunway TaihuLight system, NRCPC takes the second spot with 16.4 percent of the total performance.
- IBM takes the fourth spot with a 10.7 percent share, down from 14.9 percent six months ago.
- Thanks to Tianhe-2 and Tianhe-1A, NUDT contributes 9.2 percent of the total performance of the list, down from 10.9 percent.
- HPE is third with 12.9 percent, down from 14.2 percent six months ago.
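The vendor ordering can be read directly off the shares quoted in this list. The snippet below is a small sketch that sorts those five figures; the dictionary and variable names are illustrative, and vendors not mentioned in the list are not included.

```python
# Performance shares (percent of total installed performance) as quoted above.
performance_share = {
    "Cray": 19.9,
    "NRCPC": 16.4,
    "IBM": 10.7,
    "NUDT": 9.2,
    "HPE": 12.9,
}

# Rank vendors from the largest share to the smallest.
ranked = sorted(performance_share.items(), key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    for position, (vendor, share) in enumerate(ranked, start=1):
        print(f"{position}. {vendor}: {share} percent")
    print(f"Combined share of these vendors: {sum(performance_share.values()):.1f} percent")
```

Sorted this way, the five vendors named here account for a little over two thirds of the total performance on the list.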
Geographical Observations
- The U.S., the leading consumer of HPC systems since the inception of the TOP500 lists, is now second for the first time, behind China, with 165 of the 500 systems. China now leads both the systems and performance categories thanks to the No. 1 and No. 2 systems and a surge in industrial and research installations registered over the last few years. The European share (105 systems compared to 107 last time) has fallen and is now lower than the dominant Asian share of 218 systems, up from 173 in November 2015.
- Dominant countries in Asia are China with 167 systems (up from 109) and Japan with 29 systems (down from 37).
- In Europe, Germany is the clear leader with 26 systems followed by France with 18 and the UK with 12 systems.
Green500
- The data collection and curation of the Green500 project has been integrated with the TOP500 project. This allows submissions of all data through a single webpage at http://top500.org/submit
- The most energy-efficient system and #1 on the Green500 is Shoubu, a PEZY Computing / Exascaler ZettaScaler-1.6 System at the Advanced Center for Computing and Communication, RIKEN, Japan at 6.67 GFlops/Watt.
- #2 is the Satsuki system at the Computational Astrophysics Laboratory, RIKEN with the same architecture as the #1, but smaller in size with 6.196 GFlops/Watt.
- #3 on the Green500 is Sunway TaihuLight, the new #1 of the TOP500 at 6.05 GFlops/Watt.