Brussels / 3 & 4 February 2024

Kùzu: A Graph Database Management System for Python Graph Data Science


This talk presents Kùzu, a new open-source graph database management system (GDBMS) designed for the graph data science (GDS) ecosystem, specifically in Python. GDS applications require a series of data processing steps: extracting data from tabular sources into a graph of nodes and relationships, cleaning and transforming the graph, extracting node features, and finally moving the data into a GDS package, such as NetworkX or PyTorch Geometric, for graph analytics. GDBMSs can perform these steps easily and efficiently because they give developers high-level graph-based data models and query languages. Kùzu is a GDBMS designed to serve as the core storage system for GDS developers.
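
To make the pipeline steps above concrete, here is a minimal sketch in plain Python using only the standard library. It does not use Kùzu or its API; the records, column names (src, dst), and helper functions are all hypothetical and chosen purely for illustration. It shows the kind of hand-written extraction, cleaning, and feature code that a GDBMS is meant to replace with declarative queries.

```python
from collections import defaultdict

# Hypothetical tabular source: one row per "follows" record.
rows = [
    {"src": "alice", "dst": "bob"},
    {"src": "alice", "dst": "carol"},
    {"src": "bob", "dst": "carol"},
    {"src": "bob", "dst": "carol"},    # duplicate record
    {"src": "carol", "dst": "carol"},  # self-loop
]


def build_graph(records):
    """Step 1: extract nodes and directed edges from tabular records."""
    nodes = set()
    edges = []
    for record in records:
        nodes.update((record["src"], record["dst"]))
        edges.append((record["src"], record["dst"]))
    return nodes, edges


def clean_edges(edges):
    """Step 2: drop self-loops and duplicate edges, keeping first-seen order."""
    seen = set()
    cleaned = []
    for src, dst in edges:
        if src == dst or (src, dst) in seen:
            continue
        seen.add((src, dst))
        cleaned.append((src, dst))
    return cleaned


def degree_features(nodes, edges):
    """Step 3: compute (in_degree, out_degree) per node as a simple feature."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    return {node: (in_deg[node], out_deg[node]) for node in sorted(nodes)}


nodes, raw_edges = build_graph(rows)
edges = clean_edges(raw_edges)
features = degree_features(nodes, edges)
```

The resulting edge list and feature dictionary are the sort of structures that would then be handed to a GDS package for analytics; in a GDBMS, each of the three functions would typically correspond to a data import statement or a graph query rather than custom code.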

Kùzu's embedded architecture makes it easy to import as a library, with no server setup, and also provides performance advantages. Specifically, users can: (i) ingest their application records from various sources, such as Parquet files or in-memory Pandas DataFrames, and model them as a graph; (ii) query and transform these graphs using the Cypher query language; and (iii) export graphs into popular Python GDS packages, like NetworkX and PyTorch Geometric, with no copy cost.

The talk is tailored for data scientists and engineers. We will briefly provide the necessary background on graph analytics, then walk through code examples showcasing how Kùzu's integrations with the PyData ecosystem make developing GDS pipelines easier.

Speakers

Semih Salihoglu
Prashanth Rao
