HPC and the Modern Day Scientist

Researchers in science, technology, engineering, and mathematics have long been the traditional users of high-performance computing (HPC). Today, particularly in university environments, a new generation of researchers from every department on campus, from anthropology to linguistics to wildlife management, is competing for HPC resources, and expectations are high. It falls to research computing infrastructure teams, most of them already overburdened, to meet these demands and provide these resources.

As virtualized HPC environments become the norm for coping with demand and delivering the performance, agility, and flexibility that an increasingly varied user base requires, data storage requirements inevitably grow and diversify as well.

 

The Data Center Deluge

Managing, provisioning, and monitoring very large-scale workloads and large datasets has been par for the course for HPC data centers for a long time. But now the growing computational capability and affordability of hardware, the rise of software-defined everything, and the advent of Big Data, AI, and ML are forcing data center admins to devise innovative solutions to the storage challenges these trends create. How does one plan and prepare for unpredictable demand? How are varied data sets catered for? How are costs kept under control? And how is all of this achieved while continuing to optimize performance?

 

A Storage Solution

Ceph, the leading open-source software-defined storage solution, already plays a well-documented role in modern HPC data centers, from small installations all the way through to the likes of CERN. Ceph is flexible, inexpensive, fault-tolerant, hardware-neutral, and designed to scale out as nodes are added, which makes it an excellent choice for research institutions of any size and in any field. And because Ceph is open source and hardware-neutral, research organizations, most of which have unique storage requirements, can avoid vendor lock-in altogether. Other benefits include:

  • Ceph supports multiple storage types: object, block, and file. Regardless of the type of research being conducted, the resulting files, blocks, and objects can all live in harmony in Ceph.
  • Ceph natively supports hybrid cloud environments, which makes it easy for remote researchers, who might be located anywhere in the world, to upload their data in different storage formats.
  • Ceph is hardware-neutral and doesn’t require highly performant hardware, which lowers equipment costs and eliminates vendor lock-in.
  • Ceph is resilient: there’s no need to buy redundant hardware in case a component fails, because Ceph’s self-healing functionality re-replicates the data held on a failed node across the remaining nodes, restoring data redundancy and keeping availability high.
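
The self-healing behavior in the last point can be illustrated with a toy model. The sketch below is not Ceph code and does not use Ceph's real placement algorithm (CRUSH); it only shows the general idea of restoring a target replica count after a node is lost. The node names, object names, replica count of 3, and the `place` and `heal` helpers are all hypothetical and chosen purely for illustration.

```python
# Toy model of replica-based self-healing. This is NOT Ceph's CRUSH placement
# algorithm; all names and the replica count are assumptions for illustration.
from collections import defaultdict

REPLICAS = 3  # assumed replication factor


def place(objects, nodes, replicas=REPLICAS):
    """Assign each object to `replicas` distinct nodes, round-robin."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        obj: {nodes[(i + k) % len(nodes)] for k in range(replicas)}
        for i, obj in enumerate(objects)
    }


def heal(placement, failed, survivors, replicas=REPLICAS):
    """Drop the failed node, then copy under-replicated objects to survivors."""
    # Count how many objects each surviving node already holds.
    load = defaultdict(int)
    for holders in placement.values():
        for node in holders:
            if node != failed:
                load[node] += 1

    for holders in placement.values():
        holders.discard(failed)
        while len(holders) < replicas:
            candidates = [n for n in survivors if n not in holders]
            if not candidates:
                raise RuntimeError("not enough surviving nodes to restore redundancy")
            # Prefer the least-loaded survivor so the copies spread out evenly.
            target = min(candidates, key=lambda n: (load[n], n))
            holders.add(target)
            load[target] += 1
    return placement


nodes = ["node-a", "node-b", "node-c", "node-d"]
objects = [f"obj-{i}" for i in range(8)]

placement = place(objects, nodes)
healed = heal(placement, failed="node-b",
              survivors=[n for n in nodes if n != "node-b"])

for obj, holders in sorted(healed.items()):
    print(obj, sorted(holders))
```

In this model every object ends up with three copies again after `node-b` fails, without any spare hardware being added. A real Ceph cluster makes the equivalent decisions itself, based on its cluster map and placement rules.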

SoftIron makes the world’s best unified storage solutions for the hyperscale data center. The HyperDrive® storage appliance is custom-designed and built to optimize Ceph, unleashing the full potential of the technology for the HPC data center. HyperDrive is a high-performance, scale-out solution that runs at wire speed and draws less than 100W per 1U appliance.

 

University of Kentucky Case Study

The University of Kentucky took a fresh approach by creating a virtualized environment that supports both cloud tasks and most HPC tasks, but without costly specialized HPC hardware. When they set out to use CephFS to provide data storage for their virtualized HPC clusters and encountered performance bottlenecks, they turned to SoftIron.

Read more about how the University of Kentucky meets the demands of a new generation of researchers: download the University of Kentucky case study.