Brussels / 4 & 5 February 2023


How to deal with validation as an HPC software?

An approach to power software testing at scale


Scientific computing is constantly evolving and relies on increasingly complex technologies. Codes produced in this field require testing and validation to assess their performance and reliability. This additional but unavoidable task often brings little direct added value compared to its deployment cost. Multiple testing solutions, some of them HPC-specific, are already available, but ours has distinctive features. "Parallel Computing Validation System" (PCVS) is an HPC-aware, YAML-based job orchestration tool. It offers the unique capability of retargeting tests, that is, decoupling benchmarks from execution environments. This allows the same job set to be re-run, for example to compare two standards, without modifying test specifications. The validation set can be scaled automatically to the available resources, whether the process runs on a single node (such as a workstation) or on a thousand-node supercomputer. Beyond one-shot runs, PCVS can log successive benchmark executions for browsing, inspection, and post-processing through a dedicated Python interface. Rather than producing a single metric, PCVS can build validation trends, providing better visualization to track project evolution and ultimately leading to better software quality.
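
To make the retargeting idea more concrete, here is a minimal Python sketch of decoupling test specifications from execution environments. It is not PCVS code and does not use the PCVS interface or its YAML schema: the names `TestSpec`, `Profile`, and `retarget`, the sample `ring` test, and the power-of-two scaling rule are all assumptions made for illustration. Only the `mpicc`, `mpirun -n`, and `srun -n` command forms are standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestSpec:
    """Environment-agnostic test description (hypothetical, not the PCVS schema)."""
    name: str
    source: str
    max_procs: int  # largest process count the test is meant to exercise


@dataclass(frozen=True)
class Profile:
    """Execution environment: compiler wrapper, launcher, available resources."""
    name: str
    compiler: str
    launcher: str
    available_procs: int


def retarget(spec, profile):
    """Expand one spec into (build, run) command pairs for the given profile.

    Process counts grow by powers of two (an assumed scaling rule) and are
    capped by both the test's own limit and the profile's resources.
    """
    build = f"{profile.compiler} {spec.source} -o {spec.name}"
    limit = min(spec.max_procs, profile.available_procs)
    jobs = []
    nprocs = 1
    while nprocs <= limit:
        jobs.append((build, f"{profile.launcher} -n {nprocs} ./{spec.name}"))
        nprocs *= 2
    return jobs


# Illustrative values only: one spec, two environments.
SPEC = TestSpec(name="ring", source="ring.c", max_procs=8)
WORKSTATION = Profile("workstation", "mpicc", "mpirun", available_procs=4)
CLUSTER = Profile("cluster", "mpicc", "srun", available_procs=1024)

for profile in (WORKSTATION, CLUSTER):
    for build_cmd, run_cmd in retarget(SPEC, profile):
        print(f"[{profile.name}] {build_cmd} && {run_cmd}")
```

The same `SPEC` yields three jobs on the workstation profile and four on the cluster profile without the specification being edited, which is the property the abstract describes: the test definition stays fixed while the environment and the available resources vary.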

In this presentation, we introduce PCVS, a powerful and user-friendly job orchestration tool designed to streamline and scale test workflows. Using a simple and intuitive YAML syntax, PCVS allows flexible and efficient scheduling of tests on massively parallel resources. Our retargeting model remaps benchmark workflows, commonly made up of compilation and execution phases, across multiple environments. This approach has been used to build a high-quality MPI evaluation system that highlights differences in API support among multiple implementations, demonstrating the potential of PCVS to assess API/ABI support across implementations. Initially designed for validation processing, PCVS is versatile and can handle a wide range of use cases, from a single test directory to a large cluster-wide application stack. It should be viewed not as a test framework but as a coordinator between existing test bases and supercomputers.

Speakers

Julien Adam
