General Atomics

09/27/2021 | Press release | Distributed by Public on 09/27/2021 16:31

General Atomics Scientists Use Cloud Supercomputing for Fusion Research

Latest hardware runs robust plasma simulations that once required high-end supercomputers

San Diego, 27 September 2021 - Experts at General Atomics (GA), working together with the San Diego Supercomputer Center (SDSC) and Canadian firm Drizti Inc., have developed a prototype for running fusion reactor simulations in the cloud. The idea may significantly simplify the modeling process for fusion reactor designs and help bring fusion energy closer.

The new approach combines Microsoft's Azure platform with Drizti's supercomputing-as-a-service HPCBOX solution and GA's industry-leading CGYRO physics code. It promises to substantially reduce the difficulty of running fusion plasma simulations, which in the past required leading supercomputers.

Scientists rely heavily on advanced computer modeling to predict plasma conditions inside a fusion reactor. The extreme complexity of these plasmas - which are heated to temperatures many times that of the sun to cause hydrogen isotopes to fuse into heavier elements - means that even limited simulations require significant processing power. Intense competition for processing time on systems like the Department of Energy's Summit means that comparable alternatives are very desirable. Until recently, there have been few other options.

Advances in hardware capabilities, especially NVIDIA graphics processing units (GPUs), have recently made high-performance computing (HPC) in the cloud a feasible proposition. Microsoft Azure has been the leading player in this field, investing heavily in virtual machines (VMs) with low-latency networking. (A "virtual machine" is a software-based emulation of a physical system. It allows processing that isn't tied to a specific hardware location and enables more efficient and flexible distribution of computing resources.)

Azure's most recent VM is the ND A100 v4-series, comprising eight NVIDIA A100 GPUs and eight Mellanox HDR InfiniBand network adapters. When compared at small node counts like these, such cloud systems can match or exceed what is available at leading on-premises systems like Summit (though Summit's full capabilities are far greater).

However, more is necessary for scientists to make efficient use of these resources.

"Exceptional hardware capabilities are not enough for scientists to make good use of cloud HPC resources," said Igor Sfiligoi, an HPC software developer at SDSC, located at UC San Diego. "Scientists at on-premises facilities are typically working with a strictly prescribed and very optimized setup. This is very different from the wide-open and flexible cloud computing environment. That meant we still had to develop a convenient interface."

To address that challenge, GA and personnel from SDSC engaged with Toronto-based personal supercomputing firm Drizti Inc. Drizti offers a service called HPCBOX, which provides a simple-to-use environment for accessing Azure.

Microsoft has thus far marketed the ND A100 v4 VMs exclusively for artificial intelligence workloads. Drizti developed HPCBOX to provide a true HPC setup that makes full use of the massive processing and networking capabilities of Azure. Collaborating with GA and SDSC personnel, Drizti developed a CGYRO plugin inside HPCBOX, allowing for a completely push-button solution for fusion scientists.

"At Drizti, we are always looking to push the capabilities of our HPCBOX platform, and this was an amazing opportunity to be part of cutting-edge innovation," said Dev Subramanian, Drizti's chief technology officer. "We are extremely excited to be able to combine HPCBOX with Microsoft Azure and join forces with SDSC and General Atomics to deliver turn-key supercomputing for accelerating innovation and contribute to the advancement of science."

The collaboration between Drizti and the research personnel also resulted in HPCBOX supporting the lower-priced "spot" VMs, which are generally 60% cheaper than their standard counterparts. The discount exists because spot availability depends on system usage: access can be lost when a regular user needs the capacity. While spot-priced resources do not provide availability guarantees, they can be the most cost-effective solution for applications such as CGYRO that have reliable checkpointing capabilities to save in-progress work. HPCBOX manages all aspects of a spot resource's lifetime, including job restarts in case of preemptions.
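
For readers unfamiliar with checkpointing, the following is a minimal Python sketch of the general checkpoint-and-restart pattern that makes spot resources practical. It is an illustration only and does not represent how CGYRO or HPCBOX is implemented: the function name, the JSON checkpoint file, and the `preempt_at` parameter used to simulate a reclaimed VM are all hypothetical.

```python
import json
import os
import tempfile


def run_with_checkpoints(total_steps, checkpoint_path, preempt_at=None):
    """Advance a toy 'simulation', saving progress after every step.

    Hypothetical illustration: returns True when all steps are done and
    False if the run was interrupted (a stand-in for a spot VM being
    reclaimed by the cloud provider).
    """
    # Resume from the last saved state if a checkpoint already exists.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    else:
        state = {"step": 0, "value": 0}

    while state["step"] < total_steps:
        if preempt_at is not None and state["step"] == preempt_at:
            return False  # simulated preemption; saved progress survives

        state["value"] += state["step"]  # placeholder for real physics work
        state["step"] += 1

        # Write to a temporary file and rename it into place, so an
        # interruption mid-write cannot leave a corrupt checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)

    return True
```

In this sketch, a job that is interrupted partway through can simply be launched again with the same checkpoint path; it picks up at the last saved step rather than starting over, which is the property that lets a restart after preemption cost only a small amount of repeated work.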

GA is currently investigating the feasibility of shifting some medium-scale fusion plasma simulations to HPCBOX.

About General Atomics

Since the dawn of the atomic age, General Atomics innovations have advanced the state of the art across the full spectrum of science and technology - from nuclear energy and defense to medicine and high-performance computing. Behind a talented global team of scientists, engineers, and professionals, GA's unique experience and capabilities continue to deliver safe, sustainable, economical, and innovative solutions to meet growing global demands.

For more information contact:

Zabrina Johal
Senior Director of Strategic Development
[email protected]