Graph pattern matching is one of the most interesting and challenging operations in graph analytics. However, it is primarily supported by graph database systems such as Neo4j but, besides research prototypes, not generally available for distributed (not-only graph) processing frameworks like Apache Flink or Apache Spark.
In our talk, we want to give an overview of our current implementation of Cypher on Apache Flink. Cypher is the Neo4j graph query language and enables the intuitive definition of graph patterns including structural and semantic predicates. As the Neo4j graph data model is not supported out-of-the box by Apache Flink, we leverage Gradoop, a Flink-based graph analytics framework based on Apache Flink that already provides an abstraction of schema-free property graphs.
We will give a brief overview about the technologies used to implement Cypher, explain our query engine and give a demonstration of the available language features. Finally, we will discuss open challenges and missing features hopefully motivating people to contribute.
The project is a cooperation between the University of Leipzig and Neo Technology.
Intended audience and goal of the talk
Developers and analysts, trying to express graph queries on distributed graphs
Links to background information on the given talk for the hungry and impatient
Links to previous talks, code snippets or repositories