July 31, 2018
By: Michael Feldman
The Texas Advanced Computing Center (TACC) has won the competition for the NSF’s latest leadership-class supercomputer. The machine is scheduled to be installed at the University of Texas at Austin in 2019 and is expected to be in operation for at least five years.
The award was made under an NSF program that would provide $60 million in funding toward the procurement of an HPC system for a supercomputing center associated with a US academic institution. It represents Phase 1 of a two-phase program that extends beyond the five-year timeframe of the initial system.
The Phase 1 machine is expected to deliver two to three times the application performance of the Blue Waters supercomputer hosted by the National Center for Supercomputing Applications (NCSA) at the University of Illinois. In this case, application performance is determined by the Sustained Petascale Performance (SPP) Benchmarks, which map to Blue Waters’ current workload profile, as well as by performance analysis of other anticipated applications.
When it was deployed in 2013, Blue Waters was the most powerful of the NSF-funded supercomputers, delivering a peak performance of 13.3 petaflops. At the time, those flops came at a price tag of about $200 million, dwarfing the $60 million budgeted for the upcoming TACC system. Obtaining two to three times the application performance of Blue Waters for that money shouldn’t be too much of a stretch these days, though. Currently, the most powerful NSF-funded machine is the $30 million Stampede2 supercomputer (also at TACC), a 2017-era machine powered by Intel Xeon and Xeon Phi processors and maxing out at 18.3 peak petaflops. Assuming the ratio of peak to application flops is about the same for all these systems, the new TACC supercomputer will have to be in the range of roughly 27 to 40 peak petaflops, that is, two to three times Blue Waters’ 13.3.
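The back-of-the-envelope estimate can be reproduced in a few lines of Python. This is only a rough sketch using the figures quoted above; the function names are illustrative, and it assumes both that peak flops scale linearly with application performance and that the full price tags can be treated as hardware cost, neither of which is guaranteed in practice.

```python
# Figures quoted in the article: peak petaflops and approximate cost in millions of USD.
BLUE_WATERS_PEAK_PF = 13.3
BLUE_WATERS_COST_MUSD = 200.0
STAMPEDE2_PEAK_PF = 18.3
STAMPEDE2_COST_MUSD = 30.0
PHASE1_BUDGET_MUSD = 60.0


def required_peak_range(baseline_peak_pf, low_factor=2.0, high_factor=3.0):
    """Peak petaflops needed for a 2x-3x application speedup.

    Assumption: the peak-to-application flops ratio is the same as the baseline's.
    """
    return baseline_peak_pf * low_factor, baseline_peak_pf * high_factor


def cost_per_peak_pf(cost_musd, peak_pf):
    """Millions of USD paid per peak petaflop (ignores non-hardware costs)."""
    return cost_musd / peak_pf


low_pf, high_pf = required_peak_range(BLUE_WATERS_PEAK_PF)
blue_waters_rate = cost_per_peak_pf(BLUE_WATERS_COST_MUSD, BLUE_WATERS_PEAK_PF)
stampede2_rate = cost_per_peak_pf(STAMPEDE2_COST_MUSD, STAMPEDE2_PEAK_PF)

# Peak petaflops the Phase 1 budget would buy at Stampede2's 2017 price per petaflop.
affordable_pf = PHASE1_BUDGET_MUSD / stampede2_rate

print(f"Required peak: {low_pf:.1f} to {high_pf:.1f} petaflops")
print(f"Blue Waters: ${blue_waters_rate:.1f}M per peak petaflop")
print(f"Stampede2:   ${stampede2_rate:.2f}M per peak petaflop")
print(f"$60M at Stampede2 pricing: {affordable_pf:.1f} peak petaflops")
```

Under these assumptions, $60 million spent at Stampede2’s 2017 price per peak petaflop already lands inside the required range, which is consistent with the view that the performance target is not much of a stretch.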
As a result, this machine will end up being the most powerful academic supercomputer in the US when it comes online in 2019. And although it’s characterized as a leadership-class system, it will be significantly less powerful than Summit (187.7 petaflops) and Sierra (119.2 petaflops), the top two supercomputers at Department of Energy (DOE) national labs. NSF-sponsored systems have slowly been losing ground to the DOE machines, which have the advantage of being able to draw on a much larger funding base. Although recent NSF machines have all been a petaflop or better, the last time one of these supercomputers was ranked in the top 10 of the TOP500 list was November 2015.
In general, the NSF has retreated from the big system approach and is spreading its money across smaller projects, most of which are not earmarked for hardware procurements. Meanwhile, the DOE Office of Science has been expanding its supercomputing footprint into the academic space and routinely partners with universities for various HPC application projects, not all of which are related to the agency’s core mission of fostering energy production and security. This division of labor no longer makes much sense, especially when you consider that DOE Office of Science funding has generally grown in concert with the demands of the researchers needing HPC, while NSF funding has not.
That leaves NSF supercomputing centers that don’t get these big awards looking to other agencies, industry partners, or even the universities themselves to help fund new systems. The only good news for these centers is that the price-performance of HPC hardware continues to improve despite a slowing Moore’s Law.
Having won the Phase 1 award, TACC will be tasked with developing a project plan for the design of the Phase 2 system. Based on the wording of the original proposal, this second machine will be developed as an “upgrade design” based on the first one. The Phase 2 system is expected to have a “ten-fold or more time-to-solution performance improvement” over the Phase 1 system. A deployment date was not specified for this second machine.
Image: Advanced Computing Center, The University of Texas at Austin