Brussels / 3 & 4 February 2024

`New` Workflow Orchestrator in town: "Apache Airflow 2.x"


Efficient, well-managed orchestration of your data processing pipelines is crucial: it is the backbone of modern data processing. With all the new rage around LLMs and AI, it's more important than ever to make sure that you can wrap your head around all the old, new and upcoming data processing tools and services you use, track the provenance and lineage of your data, and make pipelines easy to author and manage for distributed teams.

This talk will bring to light how modern Airflow 2.7+ provides proven yet modernised, easy and nice-to-use ways to do this. Quite often you hear about a "new" orchestrator that aims to solve your orchestration needs, and you also often hear how it compares to Airflow. However, those comparisons often overlook the fact that since Airflow 2.0 was introduced, it has continued to evolve, modernising itself piece by piece and responding to the need to process even more data and interact with even more systems, some of which did not even exist a year ago (think LLMs), while continuing to harness the power of the tools you already use and bind them all together.

A new UI, new ways of writing your orchestration tasks, new ways to test them, lineage tracking, simpler authoring and easier interaction with object storage. Comparisons often overlook that if you start your journey with Airflow today, your experience will be quite different from what it was even 2 years ago (which is what most comparisons describe). This talk will highlight some of the most important changes.

You might also be surprised that all of this is happening without breaking compatibility, so you can still use Airflow the "old way". But maybe it's time to learn the new ways?

Speakers

Jarek Potiuk
