The Snakemake SLURM Plugin: Reproducible Computing vs. HPC Policies
- Track: HPC, Big Data & Data Science
- Room: H.1308 (Rolin)
- Day: Sunday
- Start: 09:30
- End: 09:55
In the pursuit of reproducible, scalable bioinformatics workflows, tools like Snakemake, Nextflow, and Galaxy have become indispensable. Yet deploying them on high-performance computing (HPC) systems, where SLURM is the dominant batch scheduler, remains fraught with challenges. This talk recounts the development of the official SLURM plugin for Snakemake (https://doi.org/10.12688/f1000research.29032.3; https://doi.org/10.5281/zenodo.16922261), a journey shaped less by code than by the idiosyncrasies of HPC environments. The plugin had to accommodate diverse computational needs, from GPU and MPI support to threaded applications, but the real hurdles lay in administrative policies: login nodes that are off-limits, inconsistent partition naming, and cluster-specific layouts and rules that defy standardization. I’ll share how the plugin evolved to meet the needs of data analysts from Santa Cruz to Okinawa and from Stellenbosch to Uppsala. Whether you are a bioinformatician, an analyst of CERN data, a workflow developer, or an HPC admin, this talk offers a look at the messy, human side of making reproducibility work in real-world HPC landscapes.
Speakers
- Christian Meesters