Brussels / 30 & 31 January 2016

hanythingondemand: easily creating on-the-fly Hadoop clusters (and more) on HPC systems


hanythingondemand (or HOD for short) is a set of scripts that starts services, such as a Hadoop cluster, from within another resource management system (e.g., Torque/PBS) on an HPC system. This lets traditional users of HPC systems experiment with Hadoop and other services, or run them as a production setup when no dedicated setup is available. In addition to Hadoop clusters, HOD can create HBase databases, start IPython notebooks, and set up a Spark environment.

In this talk, we will:

  • motivate the need for a framework like HOD
  • discuss its history (based on 'Hadoop On Demand')
  • explain how it works
  • showcase several use cases, including:
    • easily creating one or more Hadoop clusters on-the-fly for interactive use
    • running batch scripts on a Hadoop cluster (non-interactively)
    • spawning an IPython notebook with desired resources and connecting to it

HOD is available at https://github.com/hpcugent/hanythingondemand under the GPLv2 license. Detailed documentation is available at http://hod.readthedocs.org

Speakers

Ewan Higgs
