Brussels / 3 & 4 February 2018


Babelfish: a universal code parser for source code analysis

As the amount of written code increases exponentially, better and more scalable source code analysis tools are required. Babelfish offers a foundation on which these tools can be built.

Babelfish aims to be a universal code parser that scales. It uses the native parser provided by each language ecosystem to obtain a native AST. This AST is then converted to a language-independent format that includes universal annotations for language features, which allows both generic language-independent analysis and more detailed language-specific analysis.
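The conversion described above can be illustrated with a minimal sketch. It is not Babelfish's actual format or API: the role names, the dict layout, and the helpers `to_universal` and `find_by_role` are assumptions made up for this example. Python's standard `ast` module stands in for a native parser.

```python
import ast

# Hypothetical role names, chosen for illustration only; they are not
# taken from Babelfish's real annotation vocabulary.
ROLE_BY_NATIVE_TYPE = {
    "FunctionDef": ["Function", "Declaration"],
    "Return": ["Return", "Statement"],
    "Name": ["Identifier"],
    "Constant": ["Literal"],
}


def to_universal(node):
    """Convert a native Python AST node into a plain dict with role annotations."""
    native_type = type(node).__name__
    return {
        "native_type": native_type,  # kept for language-specific analysis
        "roles": ROLE_BY_NATIVE_TYPE.get(native_type, []),  # generic layer
        "line": getattr(node, "lineno", None),
        "children": [to_universal(child) for child in ast.iter_child_nodes(node)],
    }


def find_by_role(tree, role):
    """Language-independent query: yield every node annotated with `role`."""
    if role in tree["roles"]:
        yield tree
    for child in tree["children"]:
        yield from find_by_role(child, role)


source = "def answer():\n    return 42\n"
universal = to_universal(ast.parse(source))
functions = list(find_by_role(universal, "Function"))
literals = list(find_by_role(universal, "Literal"))
```

A query such as `find_by_role(universal, "Function")` never mentions `FunctionDef`, so the same query could run over a tree produced from another language, while `native_type` remains available for language-specific work.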

Language drivers are built on top of containers to allow easy deployment of many different languages in a generic way. An SDK for developing these drivers supplies the annotation language, conversion tools, etc. A server is in charge of handling the different drivers, language detection, scaling the number of running instances, etc. Finally, libraries and clients for different languages give language analysts the foundation they need to write the code analysis tools they want.
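The server's routing role can be sketched as follows. This is a toy model under stated assumptions, not the real server: `DriverRegistry`, the extension table, and the use of plain callables as drivers are all hypothetical. In the architecture described above each driver runs in a container, and language detection is richer than a file-extension lookup.

```python
from pathlib import PurePath

# Hypothetical extension table; a stand-in for real language detection.
LANGUAGE_BY_EXTENSION = {".py": "python", ".go": "go", ".java": "java"}


class DriverRegistry:
    """Toy stand-in for the server: detects the language, then routes to a driver."""

    def __init__(self):
        self._drivers = {}

    def register(self, language, driver):
        # `driver` is any callable taking source text and returning a tree.
        self._drivers[language] = driver

    def parse(self, filename, content):
        suffix = PurePath(filename).suffix.lower()
        language = LANGUAGE_BY_EXTENSION.get(suffix)
        if language is None:
            raise ValueError(f"cannot detect language for {filename!r}")
        driver = self._drivers.get(language)
        if driver is None:
            raise LookupError(f"no driver registered for {language!r}")
        return {"language": language, "tree": driver(content)}


registry = DriverRegistry()
# A fake driver that only counts lines, standing in for a containerized parser.
registry.register("python", lambda content: {"lines": content.count("\n")})
result = registry.parse("example.py", "x = 1\n")
```

Keeping detection and dispatch in one place means a client only submits a file name and its content; which driver handles the request is decided by the server.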

In this talk, we explain the architecture and components of Babelfish in more detail, for both language analysts interested in building tools on top of it and potential contributors to the project.


Francesc Campoy