Brussels / 2 & 3 February 2019

schedule

Reproducible science with containers on HPC through Singularity

Singularity containers


One of the biggest problems in scientific HPC is ensuring that results are reproducible. That is, the code a scientist runs locally must be able to run identically on any computational resource. Until recently, the job of ensuring that fell to system administrators who needed to manage a complex web of tools and dependencies on those resources. However, with the introduction of HPC containers via Singularity, the ability to mobilize the compute environment has never been easier. Singularity allows anybody to run their own containers on HPC, ushering in a new era of computational mobility, validity, and reproducibility.

Singularity is the most widely used container solution in high-performance computing (HPC). Enterprise users interested in AI, Deep Learning, compute drive analytics, and IOT are increasingly demanding HPC-like resources. Singularity has many features that make it the preferred container solution for this new type of “Enterprise Performance Computing” (EPC) workload. Instead of a layered filesystem, a Singularity container is stored in a single file. This simplifies the container management lifecycle and facilitates features such as image signing and encryption to produce trusted containers. At runtime, Singularity blurs the lines between the container and the host system allowing users to read and write persistent data and leverage hardware like GPUs and Infiniband with ease. The Singularity security model is also unique among container solutions. Users build containers on resources they control or using a service like the Sylabs Remote Build Service. Then they move their containers to a production environment where they may or may not have administrative access and the Linux kernel enforces privileges as it does with any other application. These features make Singularity a simple, secure container solution perfect for HPC and EPC workloads.

One of the biggest problems in scientific HPC is ensuring that results are reproducible. That is, the code a scientist runs locally must be able to run identically on any computational resource. Until recently, the job of ensuring that fell to system administrators who needed to manage a complex web of tools and dependencies on those resources. However, with the introduction of HPC containers via Singularity, the ability to mobilize the compute environment has never been easier. Singularity allows anybody to run their own containers on HPC, ushering in a new era of computational mobility, validity, and reproducibility.

Singularity is the most widely used container solution in high-performance computing (HPC). Enterprise users interested in AI, Deep Learning, compute drive analytics, and IOT are increasingly demanding HPC-like resources. Singularity has many features that make it the preferred container solution for this new type of “Enterprise Performance Computing” (EPC) workload. Instead of a layered filesystem, a Singularity container is stored in a single file. This simplifies the container management lifecycle and facilitates features such as image signing and encryption to produce trusted containers. At runtime, Singularity blurs the lines between the container and the host system allowing users to read and write persistent data and leverage hardware like GPUs and Infiniband with ease. The Singularity security model is also unique among container solutions. Users build containers on resources they control or using a service like Singularity Hub. Then they move their containers to a production environment where they may or may not have administrative access and the Linux kernel enforces privileges as it does with any other application. These features make Singularity a simple, secure container solution perfect for HPC and EPC workloads.

Singularity blocks privilege escalation within the container so if a user wants to be root inside the container, it must be root outside the container. This usage paradigm mitigates many of the security concerns that exists with containers on multi-tenant shared resources. You can directly call programs inside the container from outside the container fully incorporating pipes, standard IO, file system access, X11, and MPI. Keywords HPC, Linux Containers, Singularity, Docker, LXC, portability

References Kurtzer, G. M. (2016). Singularity 2.1.2 - Linux application and environment containers for science. Zenodo. http://doi.org/10.5281/zenodo.60736 LinuxContainers.org Infrastructure for container projects. Project sponsored by Canonical Ltd. https://linuxcontainers.org/

Speakers

Photo of Eduardo Arango Eduardo Arango

Links