
On-demand Linux-based GPU rendering in HPC

Remote visualization ecosystems have been an integral part of the virtual product design process for decades. While the early years relied on CPU-based software rendering, advancements in the underlying stack and in pre- and post-processing tools have made 3D hardware rendering increasingly popular, especially in the last decade. Designers and engineers working on large, complex 3D engineering models can now pre- and post-process data in place, without moving it back and forth between their local workstations and high-performance computing (HPC) clusters, a transfer that can take days even over high-bandwidth dedicated network pipes.

In this blog post, we share how engineers and researchers can use the autoscaling functionality of the HPC stack available in Oracle Cloud Marketplace to provision, on demand, a 3D-rendering-capable Linux server in an existing HPC cluster. The entire solution is based on open-source tools, making it a zero-cost environment from a software perspective: you pay only for the underlying hardware for the time you use it, not for the software tools.

Environment for our autoscaling use case

For our example use case, we already have an HPC cluster running with autoscaling enabled in our Oracle Cloud Infrastructure (OCI) tenancy, along with the service limits required to also provision the virtual machine shape VM.GPU.A10.1, powered by the NVIDIA A10 Tensor Core GPU. We also have a preconfigured operating system image for the GPU shape that includes the graphics driver, VirtualGL libraries, and TurboVNC. We won't go into the configuration details of the OS, the image, or the remote visualization tool. Instead, we use a preconfigured OS image to illustrate the overall concept.

Architecture

Figure 1: Simplified Example of HPC VDI Architecture

Enterprise-class HPC environments generally comprise management servers, compute and GPU nodes, Linux- or Windows-based virtual desktops, and high-throughput storage file systems. OCI offers a wide range of Intel- and AMD-based virtual machine (VM) and bare metal shapes for building the overall ecosystem. For a complete list of currently available shapes, see Compute shapes; you can also query them with the OCI CLI, as sketched after the list below. We're using an NVIDIA A10-based virtual instance for on-demand GPU rendering. The shape name and specifications are listed below.

  • Shape: VM.GPU.A10.1
  • OCPU: 15
  • GPU Memory (GB): 24
  • CPU Memory (GB): 240
  • Local Disk: Block storage only
  • Max Network Bandwidth (Gbps): 24
  • Max VNICs (Linux): 15
  • Max VNICs (Windows): 15
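
If you prefer the command line, the same shape catalog can be queried with the OCI CLI. This is only an illustrative sketch; the compartment OCID is a placeholder for your own value:

$ oci compute shape list --compartment-id <compartment-ocid>   # lists the compute shapes available to the compartment's region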

Enabling on-demand provisioning in HPC

On-demand provisioning of compute shapes in an already running HPC cluster requires updating the configuration of both the batch job scheduler and the autoscaling functionality that is part of the HPC stack.

The HPC stack available in Oracle Cloud Marketplace uses SLURM as the job scheduler. To use the NVIDIA A10 GPU-based shape VM.GPU.A10.1 for GPU rendering, our first step is to add its technical specs to the SLURM configuration with a state of FUTURE. This state indicates that the node is defined for future use and doesn't need to exist when the SLURM daemons are started. For more details, refer to the SLURM workload manager's configuration guide.
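
As a rough sketch, the node definition for the GPU shape in slurm.conf could look like the following. The node name pattern, CPU count, memory value, and GRES setup are assumptions for illustration; they must match your own cluster layout, gres.conf, and the configuration keywords of your SLURM version:

# Illustrative slurm.conf fragment -- values are examples, not the stack's defaults.
# State=FUTURE means the node doesn't need to exist when the SLURM daemons start.
# The node must also appear in the Nodes= list of the partition used for GPU jobs.
NodeName=compute-gpu-node-[1-180] CPUs=30 RealMemory=230000 Gres=gpu:1 Features=a10 State=FUTURE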

Next, we update the HPC stack's autoscaling configuration file, queues.conf. The stack provides an example file, /opt/oci-hpc/conf/queues.conf.example, that shows how to configure multiple queues and multiple instance types.
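
The exact schema is defined by that example file; the fragment below is only a sketch of the mapping we rely on later, where the queue name matches the SLURM partition and the instance type name is used as the job constraint. The field names and image placeholder are assumptions, so follow queues.conf.example for the authoritative layout:

# Illustrative queues.conf fragment -- see /opt/oci-hpc/conf/queues.conf.example for the real schema.
- name: compute                 # SLURM partition targeted with srun --partition=compute
  instance_types:
    - name: a10                 # matched by srun --constraint=a10
      shape: VM.GPU.A10.1       # GPU shape provisioned on demand
      image: <OCID of the preconfigured image with GPU driver, VirtualGL, and TurboVNC>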

Unlike OCI's native autoscaling feature, which relies on instance pools and provisions more compute resources based on either metrics or a schedule, the HPC stack's autoscaling functionality allows on-demand provisioning of clusters per job and automatically destroys them when they are idle or no longer in use. We use this functionality to provision GPU-based shapes as remote visualization servers for 3D rendering and for pre- and post-processing of large engineering models in place. This setup not only helps in optimal utilization of GPU resources, but also brings down the overall cost of running a remote visualization farm in an HPC cluster.

For more information on HPC stack's autoscaling functionality, see the Autoscaling section under the stack's usage information in High Performance Computing - RDMA cluster network.

Submitting an interactive job

After you have successfully configured the batch scheduler and autoscaling, you can submit an interactive job using the SLURM command, srun. In our setup, we used the following parameters and their associated values:

# srun --partition=compute --gpus=1 --constraint=a10 --pty /bin/bash

where "--partition" refers to the SLURM queue, "--gpu" refers to the count of GPUs required, and "-constraint" to the name defined under "instance_types" definition in autoscaling configuration file "queues.conf."

The autoscaling framework automatically provisions the GPU shape with the preconfigured image in the background. The process takes about five minutes, after which the command drops you at the bash prompt of the newly provisioned GPU system, which also has access to all of the HPC cluster's shared storage folders.
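
If you want to follow the provisioning from another shell, or confirm that the GPU is visible once the prompt appears, standard SLURM and NVIDIA utilities are enough. This optional sanity check is not part of the stack's documented workflow:

$ squeue -u $USER                 # from the login node: the interactive job waits while the node is provisioned
$ sinfo --partition=compute       # the new GPU node appears in the partition as it comes online
$ nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv   # from the srun shell on the GPU node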

With the GPU system successfully provisioned and available for use, we initiate a VNC session on it by running the vncserver command with the appropriate arguments. For example, our custom image has a TurboVNC server configured to use the one-time password feature, VirtualGL libraries, and MATE as the window manager, so we ran vncserver with the following arguments:

# vncserver -vgl -otp -wm mate-session

Desktop 'TurboVNC: compute-gpu-node-172:1 (opc)' started on display compute-gpu-node-172:1

One-Time Password authentication enabled.  Generating initial OTP ...

Full control one-time password: XXXXXXXX

Run '/opt/TurboVNC/bin/vncpasswd -o' from within the TurboVNC session or

    '/opt/TurboVNC/bin/vncpasswd -o -display :1' from within this shell

    to generate additional OTPs

Starting applications specified in /opt/TurboVNC/bin/xstartup.turbovnc

(Enabling VirtualGL)

Log file is /home/opc/.vnc/compute-gpu-node-172:1.log

#

You can use the one-time password and other information returned by the vncserver command, such as the hostname and display number, to establish a remote desktop session with a VNC client installed on your local workstation, such as TigerVNC Viewer.
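
For example, the session above runs on display :1, which maps to TCP port 5901. One way to reach it from a local workstation is an SSH tunnel through the cluster's bastion or login node and the TigerVNC client; the user and bastion address below are placeholders for your own environment:

$ ssh -L 5901:compute-gpu-node-172:5901 opc@<bastion-public-ip>   # forward the VNC port through the bastion
$ vncviewer localhost:5901                                        # enter the one-time password when prompted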

We ran the standard GLX Spheres benchmark from India inside a VNC session hosted on the VM.GPU.A10.1 server in one of our European regions and achieved more than 400 frames per second, as shown in the following figure.

Figure 2: GLX Spheres benchmark output (frames/sec)

Summary

OCI doesn't just offer its customers a range of GPU shapes with the latest and best-in-class graphics cards for GPU and 3D hardware rendering requirements. It also offers optimal ways to use them efficiently in a Linux-based HPC cluster.

We encourage you to connect with Oracle's HPC team and evaluate the solution on Oracle Cloud Infrastructure with a 30-day free trial. For more information, see the following resources: