Brussels / 2 & 3 February 2019


MALT, A Malloc Tracker

In HPC, available memory keeps growing quickly, with servers soon reaching terabytes of memory. This puts more stress on the allocator and on the underlying OS, which leads to more performance issues and more memory management mistakes in large applications.

Memory misuse is also an issue for more common applications, such as desktop applications with large code bases.

MALT is a profiling tool dedicated to memory management. It provides temporal charts, global metrics and source code annotations, and comes with a nice web graphical interface to dig into the profile.
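To make "temporal charts, global metrics and source code annotations" more concrete, here is a minimal sketch of how such numbers can be derived from a stream of allocation events. It is not MALT's implementation: the event tuple layout, the `summarize` function and the call-site strings are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def summarize(events):
    """Aggregate a trace of allocation events (hypothetical format).

    Each event is a (time, kind, address, size, site) tuple, where kind is
    "malloc" or "free". size and site are ignored for "free" events.
    """
    live = {}        # address -> (size, site) for blocks not yet freed
    current = 0      # bytes currently allocated
    peak = 0         # highest value reached by `current`
    timeline = []    # (time, live bytes) samples, the data behind a temporal chart
    per_site = defaultdict(lambda: {"count": 0, "bytes": 0})

    for time, kind, address, size, site in events:
        if kind == "malloc":
            live[address] = (size, site)
            current += size
            per_site[site]["count"] += 1
            per_site[site]["bytes"] += size
        elif kind == "free" and address in live:
            freed_size, _ = live.pop(address)
            current -= freed_size
        peak = max(peak, current)
        timeline.append((time, current))

    # Bytes still allocated when the trace ends (candidates for leaks).
    still_live = sum(size for size, _ in live.values())
    return {
        "peak": peak,
        "still_live": still_live,
        "timeline": timeline,
        "per_site": dict(per_site),
    }


# Made-up trace: two call sites, one block freed before the end.
events = [
    (0, "malloc", 0x1000, 64, "main.c:10"),
    (1, "malloc", 0x2000, 128, "solver.c:42"),
    (2, "free", 0x1000, 0, ""),
    (3, "malloc", 0x3000, 32, "solver.c:42"),
]
report = summarize(events)
```

In this sketch, `timeline` corresponds to a temporal chart, `peak` and `still_live` to global metrics, and `per_site` to the per-line counters that a source code annotation view would display.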

I did my PhD developing a memory allocator for HPC applications on large-scale supercomputers. During this period I observed many inefficient patterns and issues in existing HPC applications, mostly in a multi-million line simulation code, and I worked a lot on supporting them in my memory allocator to improve things.

During my postdoc at the Exascale Computing Research Lab, I focused on implementing a memory profiler that shows the user what the application is doing with the allocator and provides ways to observe the common issues and mistakes in large-scale applications.

MALT also takes an approach that is uncommon for such tools in HPC: it provides a web-based interface built with tools like D3.js, Bootstrap and Angular and served by a small Node.js web server. This fixes a big issue in HPC when running remotely, where the profiler GUI has to be X-forwarded, which makes it slow and badly themed. The alternative is to run the GUI locally, without the source code in the same place. With the web server, the interface can easily be reached through an SSH port forward, and several people can look at the same profile remotely. It also quickly gives a nicer rendering with less development overhead.

I have used it a lot in my own development at CERN, for a code scaling to 500 nodes, and I also briefly tried it on a large ~1.5 million line C++ application used by physicists there, to check that it did not crash at this challenging scale. The tool currently limits itself to per-process analysis, which means that for an MPI application it dumps one profile file per rank. As a first step, though, this is what is meaningful for memory management.


Sébastien Valat