Brussels / 1 & 2 February 2025

Apache Arrow tensor arrays: an approach for storing tensor data


This talk introduces Apache Arrow's tensor arrays as a tool for representing, storing, and transporting arrays of tensors. We'll cover the tensor array memory layout specification and its implementations in Arrow C++ and Python, and show how they help interoperate with the PyData and database ecosystems.

We'll present the fixed and variable shape tensor array specifications and their implementations, and show how they can be used to interoperate with Arrow-aware ecosystems such as DLPack, NumPy, and others. We'll also discuss the design decisions we made to keep the two tensor arrays as generic and universal as possible.

Speakers

Rok Mihevc
Alenka