Brussels / 4 & 5 February 2023

HPC, Big Data and Data Science devroom



Read the Call for Papers at https://hpc-bigdata-fosdem23.github.io/.

High Performance Computing (HPC) and Big Data are two important approaches to scientific computing. HPC typically deals with smaller, highly structured data sets and huge amounts of computation, while Big Data, not surprisingly, deals with gigantic, unstructured data sets and focuses on I/O bottlenecks. With the Big Data trend unlocking access to an unprecedented amount of data, Data Science has emerged to tackle the problem of creating processes and approaches for extracting knowledge or insights from these data sets. Machine learning and predictive analytics algorithms have joined the family of more traditional HPC algorithms and are pushing the requirements of cluster and data scalability.

Free and Open Source communities have been the foundation of the HPC and Big Data communities for some time. In the HPC community, it should be no surprise that currently 100% of the Top500 supercomputers in the world run (some variant of) Linux. On the Big Data side, the Hadoop ecosystem has had a tremendous amount of Open Source contributions from a wide range of organizations coming together under the Apache Software Foundation.

Our goal is to bring the communities together, share expertise, learn how we can benefit from each other's work, and foster further joint research and collaboration. We welcome talks about Free and Open Source solutions to the challenges presented by large-scale computing, data management and data analysis.

Event | Speakers | Start | End

Sunday

Efficiently exploit HPC resources in scientific analysis and visualization with ParaView | Nicolas Vuaille | 09:00 | 09:25
Simplifying the creation of Slurm client environments (A Straw for your Slurm beverage) | Pablo Llopis Sanmillan | 09:30 | 09:55
Troika: Submit, monitor, and interrupt jobs on any HPC system with the same interface | Olivier Iffrig, Axel Bonet | 10:00 | 10:25
Self-service Kubernetes Platforms with RDMA on OpenStack (K8s, OpenStack and RDMA are just like oil, vinegar and bread?) | John Garbutt | 10:30 | 10:55
How to deal with validation as an HPC software? (An approach to power software testing at scale) | Julien Adam | 11:00 | 11:25
LOFAR: FOSS HPC across 2000 kilometers (The unknown world of open source radio astronomy software) | Corne Lukken | 11:30 | 11:55
HPC Container Conformance (Guidance on how to build and annotate containers for HPC) | Christian Kniep | 12:00 | 12:10
The LDBC benchmark suite | Gabor Szarnyas, David Püroja | 12:10 | 12:20
Multiple Double Arithmetic on Graphics Processing Units (GPU acceleration to offset the cost overhead of multiple double arithmetic) | Jan Verschelde | 12:25 | 12:35
Overengineering an ML pet project to learn about MLOps (Force yourself to do pushups while working from home!) | Victor Sonck | 12:35 | 12:45
Reproducibility and performance: why choose? (CPU tuning in GNU Guix) | Ludovic Courtès | 12:50 | 13:00
LIBRSB: Universal Sparse BLAS Library (A highly interoperable Library for Sparse Basic Linear Algebra Subroutines and more for Multicore CPUs) | Michele Martone | 13:00 | 13:25
numba-mpi (Numba @njittable MPI wrappers tested on Linux, macOS and Windows) | Sylwester Arabas, Oleksii Bulenok, Kacper Derlatka | 13:30 | 13:55
Running MPI applications on Toro unikernel | Matias Vara | 14:00 | 14:25
MUST: Compiler-aided MPI correctness checking with TypeART | Alexander Hück | 14:30 | 14:55
Link-time Call Graph Analysis to facilitate user-guided program instrumentation (An LLVM based approach) | Tim Heldmann, Sebastian Kreutzer | 15:00 | 15:25
How the Spack package manager tames the stat storm | Harmen Stoppels | 15:30 | 15:55
Keeping the HPC ecosystem working with Spack CI | Todd Gamblin | 16:00 | 16:25
Developing effective testing pipelines for HPC applications | Jason Nucciarone | 16:30 | 16:55