Brussels / 4 & 5 February 2017


From pipelines to graphs

Escape the tyranny of the shell’s linear pipelines with dgsh


The Unix dgsh shell provides an expressive way to construct sophisticated and efficient data processing pipelines using standard Unix tools, as well as third-party and custom-built components. Dgsh allows the specification of pipelines of non-uniform, non-linear operations. For example, tee can feed three processes, whose outputs can then be collected by paste. The pipelines form a directed acyclic graph of processes, which can typically run on multiple processor cores, thus increasing the task's processing throughput. We will see how to use dgsh in practice through a number of general data processing and domain-specific examples, and how to adapt tools for use with dgsh.
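To make the tee-and-paste example concrete, here is a minimal sketch of the same process graph wired up by hand in plain POSIX sh with named pipes. It is not dgsh syntax and not taken from the talk: dgsh has its own notation for such graphs, which this sketch does not attempt to reproduce. The helper name fan_out_paste and the three filters are assumptions chosen purely for illustration, and the approach is only safe for small inputs and line-by-line filters, since nothing here guards against a filter that buffers its whole input.

```shell
#!/bin/sh
# Emulate the graph from the abstract by hand: tee fans one input stream out
# to three processes, and paste gathers their outputs side by side.
# fan_out_paste is a hypothetical helper name, not part of dgsh.
fan_out_paste() {
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/in1" "$dir/in2" "$dir/in3" \
           "$dir/out1" "$dir/out2" "$dir/out3" || return 1

    # Three independent line-by-line filters, each running as its own process.
    tr 'a-z' 'A-Z'             < "$dir/in1" > "$dir/out1" &  # upper-case copy
    awk '{ print length($0) }' < "$dir/in2" > "$dir/out2" &  # characters per line
    awk '{ print NF }'         < "$dir/in3" > "$dir/out3" &  # words per line

    # Gather: paste joins the three result streams column-wise, tab-separated.
    paste "$dir/out1" "$dir/out2" "$dir/out3" &

    # Scatter: tee copies this function's standard input to all three filters.
    tee "$dir/in1" "$dir/in2" > "$dir/in3"

    # Wait for the filters and paste to finish, then remove the named pipes.
    wait
    rm -rf "$dir"
}

printf 'hello world\nfoo\n' | fan_out_paste
```

For the two-line sample input, this should print one tab-separated row per input line: the upper-cased text, its length, and its word count. The six named pipes, the ordering of the background jobs, and the cleanup are the kind of plumbing that a shell with first-class support for process graphs is meant to take off the user's hands.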

Speakers

Diomidis Spinellis
