Brussels / 3 & 4 February 2024


Semantically-driven data management solution for I/O intensive HPC workflows

Semantically-driven data management is an approach that focuses on the meaning and context of data, rather than simply storing it. This semantic information can be used to drive workflows, and the storage and retrieval of data can be optimised and specialised for different workflows based on data semantics.

DASI (Data Access and Storage Interface) is a semantically-driven data store developed by ECMWF as part of the EuroHPC project IO-SEA. Based on the FDB object store library developed and in operational use at ECMWF, DASI manages data using domain-specific, scientifically meaningful metadata keys, and separates data management from the underlying storage backend technologies. This allows very fast and efficient algorithms to search for and retrieve data from large datasets based on its semantic meaning. DASI is modular and compatible with multiple backends (e.g., POSIX, Ceph, DAOS), and is accessible through APIs in Python, C++ and C. We will explain the concept of semantic description of data and demonstrate DASI as a data management solution.
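
To make the idea of key-based access concrete, here is a minimal Python sketch of a store that is addressed by metadata keys rather than by file paths. It is not the DASI API: the class names (SemanticStore, InMemoryBackend), the methods (archive, retrieve) and the key names (experiment, variable, step) are all hypothetical and chosen only for illustration. The in-memory backend stands in for a real storage technology, and the linear scan stands in for the indexed search a production system would presumably use.

```python
from typing import Dict, Iterator, List, Tuple

Key = Dict[str, str]


class InMemoryBackend:
    """Hypothetical stand-in for a storage backend; it only sees ids and bytes."""

    def __init__(self) -> None:
        self._blobs: Dict[int, bytes] = {}

    def write(self, data: bytes) -> int:
        blob_id = len(self._blobs)
        self._blobs[blob_id] = data
        return blob_id

    def read(self, blob_id: int) -> bytes:
        return self._blobs[blob_id]


class SemanticStore:
    """Toy store that indexes objects by metadata keys, not by path or filename."""

    def __init__(self, backend: InMemoryBackend) -> None:
        self._backend = backend
        self._index: List[Tuple[Key, int]] = []

    def archive(self, key: Key, data: bytes) -> None:
        # The caller describes the data; where the bytes end up is the backend's concern.
        self._index.append((dict(key), self._backend.write(data)))

    def retrieve(self, query: Dict[str, List[str]]) -> Iterator[Tuple[Key, bytes]]:
        # A record matches when every queried key holds one of the requested values.
        for key, blob_id in self._index:
            if all(key.get(name) in values for name, values in query.items()):
                yield key, self._backend.read(blob_id)


store = SemanticStore(InMemoryBackend())
for step in ("0", "6", "12"):
    for variable in ("temperature", "pressure"):
        store.archive(
            {"experiment": "demo", "variable": variable, "step": step},
            f"{variable}@{step}".encode(),
        )

# Ask for data by meaning: temperature at steps 0 and 6, regardless of storage layout.
matches = list(store.retrieve({"variable": ["temperature"], "step": ["0", "6"]}))
```

Because callers only ever pass keys and queries, the backend class in this sketch could be swapped for another one without changing the calling code, which is the separation between data management and storage technology described above.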


Metin Cakircali