Warp 10: A new paradigm for Time Series analysis
IoT has deeply changed the value chain. End users of consumer devices expect instant gratification based on the data and metrics produced by the object. As a result, IoT makers have to perform an epic split: build the actual device (mechanics and electronics), excel at firmware development so that the device stays secure, and design services based on the object's data. Covering all of that is a hard job for a product team. In such a context, open source software provides the basic building blocks of IoT devices.
Data produced by smart devices are often time series. Storing them is not the challenge, since many good open source solutions exist (OpenTSDB, InfluxDB and others), but developing algorithms on top of your data is difficult, and it can become a nightmare once you have to think about scalability. Warp 10 is a three-year-old open source platform designed to collect, store and manipulate sensor data with WarpScript, a language dedicated to time series analysis. WarpScript works natively on time series stored in Warp 10 (backed by either LevelDB or HBase) but can be connected to any data source.
When you manipulate sensor data, you must deal with privacy. Security and privacy have been addressed by Warp 10 since its very inception, through fine-grained access control mechanisms, encryption capabilities and throttling management. The Warp 10 platform can be integrated into an open ecosystem that includes Storm, Flink and Apache Pig, with one cornerstone: the capability to manipulate time series with WarpScript.
The Warp 10 Platform is designed to collect, store and manipulate sensor data. Sensor data are ingested as sequences of measurements (also called time series). The Warp 10 Platform offers the possibility for each measurement to also have spatial metadata specifying the geographic coordinates and/or the elevation of the sensor at the time of the reading. Those augmented measurements form what we call Geo Time Series.
The first differentiating factor of Warp 10 is that both space (location) and time are considered first-class citizens. Working with Geo Time Series gives you geo-located readings without having to maintain four separate series and keep track of the reading context yourself.
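To make the idea concrete, here is a minimal Python sketch of a Geo Time Series as a data structure. It is not the Warp 10 storage model or API: the `GeoReading` class, its field names, the microsecond time unit and the sample values are all assumptions made for illustration. It only shows how a single series can carry value, latitude, longitude and elevation together instead of spreading them over four series.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GeoReading:
    """One measurement together with its optional spatial context (hypothetical type)."""
    timestamp_us: int                    # time of the reading; microseconds is an assumed unit
    value: float                         # the sensor value itself
    latitude: Optional[float] = None     # degrees, if known
    longitude: Optional[float] = None    # degrees, if known
    elevation_m: Optional[float] = None  # metres, if known


# A single series: each reading carries its own location, so there is no need
# to align separate value, latitude, longitude and elevation series by timestamp.
truck_temperature: List[GeoReading] = [
    GeoReading(1_000_000, 4.2, 48.39, -4.49, 12.0),
    GeoReading(2_000_000, 4.5, 48.40, -4.48, 15.0),
    GeoReading(3_000_000, 4.4),  # location unknown for this reading
]


def located(series: List[GeoReading]) -> List[GeoReading]:
    """Keep only the readings that have both a latitude and a longitude."""
    return [r for r in series if r.latitude is not None and r.longitude is not None]
```

Because the location is optional per reading, a series can mix located and non-located measurements without changing its shape.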
Complex searches like “find all the sensors active during last Monday in the perimeter delimited by this geo-fencing polygon” can be done without involving expensive joins between separate time series for the same source.
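The following Python sketch illustrates what such a search amounts to when location travels with each reading. It is an assumption-laden illustration, not how Warp 10 implements or indexes the query: the function names `inside` and `active_sensors`, the tuple layout and the flat-plane ray-casting test are all choices made here for brevity.

```python
from typing import Dict, List, Sequence, Set, Tuple

Point = Tuple[float, float]          # (latitude, longitude)
Reading = Tuple[int, float, float]   # (timestamp, latitude, longitude)


def inside(polygon: Sequence[Point], lat: float, lon: float) -> bool:
    """Ray-casting point-in-polygon test, treating lat/lon as a flat plane."""
    hit = False
    for i in range(len(polygon)):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[i - 1]  # previous vertex (wraps around at i == 0)
        # Does this edge straddle the point's longitude?
        if (lon1 > lon) != (lon2 > lon):
            crossing_lat = lat1 + (lon - lon1) * (lat2 - lat1) / (lon2 - lon1)
            if lat < crossing_lat:
                hit = not hit
    return hit


def active_sensors(
    series: Dict[str, List[Reading]],
    polygon: Sequence[Point],
    start: int,
    end: int,
) -> Set[str]:
    """Sensors with at least one reading in [start, end) located inside the polygon."""
    return {
        sensor_id
        for sensor_id, readings in series.items()
        if any(start <= ts < end and inside(polygon, lat, lon) for ts, lat, lon in readings)
    }
```

Each sensor is tested in a single pass over its own readings, with no join between a value series and separate position series.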
We created WarpScript, an extensible stack-oriented programming language which offers more than 700 functions and several high-level frameworks to ease and speed up your data analysis. Simply create scripts containing your data analysis code and submit them to the platform: they execute close to where the data resides, and you get the result of that analysis as a JSON object that you can integrate into your application.
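For readers unfamiliar with stack-oriented languages, the toy Python evaluator below sketches the general execution model: values are pushed on a stack, functions pop their arguments and push their result, and the final stack is serialized as JSON. This is not WarpScript and does not reproduce its syntax or function library; the `run` function, the handful of words it knows and the bottom-to-top output order are assumptions made purely for illustration.

```python
import json
from typing import Callable, Dict, List


def _binary(fn: Callable[[float, float], float]) -> Callable[[List[float]], None]:
    """Wrap a two-argument function as a stack word: pop b, pop a, push fn(a, b)."""
    def word(stack: List[float]) -> None:
        b = stack.pop()
        a = stack.pop()
        stack.append(fn(a, b))
    return word


# Hypothetical word set for this toy language.
WORDS: Dict[str, Callable[[List[float]], None]] = {
    "+": _binary(lambda a, b: a + b),
    "-": _binary(lambda a, b: a - b),
    "*": _binary(lambda a, b: a * b),
    "/": _binary(lambda a, b: a / b),
    "MAX": _binary(max),
}


def run(script: str) -> str:
    """Evaluate a whitespace-separated postfix script and return the stack as JSON."""
    stack: List[float] = []
    for token in script.split():
        if token in WORDS:
            WORDS[token](stack)         # a known word consumes and produces stack values
        else:
            stack.append(float(token))  # anything else is treated as a numeric literal
    return json.dumps(stack)
```

For example, `run("21.5 22.5 + 2 /")` averages two readings and returns the one-element stack as a JSON array. Extending the language means adding entries to `WORDS`, which mirrors in miniature the extensibility described above.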
The WarpScript approach is another differentiating factor of Warp 10. Traditional time series platforms offer few manipulation options, usually providing either a SQL-like query language which cannot express complex analysis or a reduced set of aggregation functions. These approaches force you to write more code on the client side, which increases your development time and leads to massive transfers of unprocessed data from the platform to your applications. Our approach lets you focus on your business use cases, simplifying IoT and sensor data applications by handling a larger share of the data analysis on the platform, close to the data.