FOSDEM 2017
Dask - extending Python data tools for parallel and distributed computing

Track: HPC, Big Data and Data Science devroom
Room: H.2213
Day: Saturday
Start: 13:00
End: 13:25

The growing Python data science ecosystem, including the foundational packages NumPy and pandas, provides powerful tools for data analysis that are widely used in a variety of applications. These libraries, however, were typically designed for data that fits in memory and for computations that run on a single core.

Dask is a Python library for parallel and distributed computing, built on blocked algorithms and task scheduling. By leveraging the existing Python data ecosystem, Dask makes it possible to compute on arrays and dataframes that are larger than memory, exploiting parallelism or distributed computing power, all through a familiar interface that mirrors NumPy arrays and pandas dataframes.
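
To make the idea of a blocked algorithm concrete, here is a minimal sketch in plain Python using only the standard library. It is not Dask's implementation or API: the function names (`block_ranges`, `block_sum_and_count`, `blocked_mean`) and the use of a thread pool as a stand-in for a task scheduler are assumptions made purely for illustration. The computation is split into per-block tasks, each task touches only its own block, and the partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def block_ranges(n, block_size):
    """Yield (start, stop) pairs that cover range(n) in consecutive blocks."""
    for start in range(0, n, block_size):
        yield start, min(start + block_size, n)


def block_sum_and_count(bounds):
    """Task for one block: only this block's values are visited."""
    start, stop = bounds
    total = 0
    for value in range(start, stop):  # stand-in for loading one chunk of data
        total += value
    return total, stop - start


def blocked_mean(n, block_size, workers=4):
    """Mean of 0..n-1, built from per-block partial results."""
    # The thread pool plays the role of a (very simple) task scheduler here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(block_sum_and_count, block_ranges(n, block_size)))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


result = blocked_mean(1_000_000, block_size=100_000)
print(result)  # 499999.5
```

The full sequence of values is never held in memory at once; only the small per-block results are kept and combined. According to the abstract, Dask applies this kind of splitting and scheduling behind its array and dataframe interfaces, so user code can stay close to what it would look like with NumPy or pandas.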

Slides: https://jorisvandenbossche.github.io/talks/2017FOSDEMdask/

Speakers

Joris Van den Bossche

Brussels / 4 & 5 February 2017
