NetApp Inc.

09/26/2022 | Press release | Archived content

How Yale New Haven Health transformed their data infrastructure

"Over the last 2 years, we've been figuring out how to migrate off of Hadoop and into something more cost effective and agile." Wade Schulz, MD, PhD, Director, CORE Center for Computational Health, Center for Outcomes Research & Evaluation

Yale needed to centralize and integrate data from numerous sources so that it could be used easily on any data-science platform for research into inpatient care and healthcare outcomes. COVID-19, with its variants and increased clinical cases, created a huge influx of data that exposed the shortcomings of the existing big-data infrastructure. Yale needed a new infrastructure to support their influential research on the disease and its cures.

Before selecting NetApp®, Yale ran a mostly on-premises analytics environment on the Hadoop-based Hortonworks Data Platform for capturing, processing, and analyzing data. However, rising licensing costs became a concern, and managing the data lifecycle grew increasingly difficult. Budget constraints and failing hardware made it hard to justify expanding the existing environment. It was time for a new computational health platform. Enter NetApp.

"NetApp was selected because of its price/performance, scalability, and the availability of tools that facilitate the transfer of data between platforms." Dr. Wade Schulz

The goal was to create a more flexible, future-proof, cloud-ready system. NetApp came in with a plan to:

  • Maximize data value with a unified data lake
  • Make sure that the right tools and skillsets are in place
  • Provide data access where and when it's needed
  • Protect patients' sensitive information with enhanced security and governance

Yale's computational health platform was born out of an agile, disaggregated architecture better suited to their artificial intelligence and machine learning workflows. Memory and storage are now used far more efficiently, and licensing costs have been reduced by $500k. Additionally, the data lake enables easier adoption of technology such as GPUs.

With NVIDIA DGX Foundry, data scientists can easily get to work without IT building a complex AI infrastructure themselves. The platform is fully managed by NVIDIA and NetApp, allowing Yale to focus on research and outcomes instead of managing infrastructure.