Online / 5 & 6 February 2022


How to create (lots!) of sample time-series data with PostgreSQL generate_series()

Exploring new features in PostgreSQL or reproducing an unusual query plan can be tricky without representative data to work with. While there is a plethora of sample data sources and tools to import them, you can end up spending too much time hunting for a suitable dataset. In our day-to-day work at Timescale, we often need to quickly create lots of sample time-series data to demonstrate new features, run a benchmark, or help community members with examples as they learn.

Although using real application data would be ideal, PostgreSQL provides the generate_series() function, which makes it easy to create a representative time-series dataset with varying cardinalities and time ranges.
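
As a rough illustration of the idea, the query below is a minimal sketch that cross-joins two generate_series() calls: one producing a timestamp every minute for the last hour, the other producing a handful of device IDs. The column names (time, device_id, reading) and the choice of four devices are assumptions made for this example, not something prescribed by the talk.

```sql
-- Minimal sketch: one random reading per device per minute over the last hour.
-- "device_id" and "reading" are hypothetical names chosen for illustration.
SELECT ts AS time,
       device_id,
       random() * 100 AS reading          -- uniform value in [0, 100)
FROM generate_series(
         now() - INTERVAL '1 hour',       -- start of the time range
         now(),                           -- end of the time range (inclusive)
         INTERVAL '1 minute') AS ts,      -- spacing between rows
     generate_series(1, 4) AS device_id   -- cardinality: four devices
ORDER BY ts, device_id;
```

Widening the interval range lengthens the dataset, and raising the upper bound of the second series raises the cardinality; the row count is simply the product of the two series.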

In this talk, we'll introduce generate_series() and demonstrate how to use it, together with custom PostgreSQL user-defined functions, to create realistic-looking time-series data of all shapes and sizes. Once we've mastered the basics, we'll dial it up a notch by incorporating PostgreSQL math functions and relational data to create realistic time-series patterns for use cases like sales or website visits.
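
To give a flavor of that second step, here is a hedged sketch of hourly website visits that follow a daily cycle. It assumes a small, made-up lookup of sites (the sites CTE, with hypothetical site_id and base_visits columns) standing in for relational data, and uses sin() and random() to shape the values; the date range, amplitudes, and noise level are arbitrary choices for the example.

```sql
-- Sketch: hourly visits per site with a daily sine-wave pattern plus noise.
-- The "sites" data and all constants below are invented for illustration.
WITH sites (site_id, base_visits) AS (
    VALUES (1, 100), (2, 500), (3, 2000)
)
SELECT ts AS time,
       s.site_id,
       round(s.base_visits
             -- daily cycle: swings between 0.5x and 1.5x of the baseline
             * (1 + 0.5 * sin(2 * pi() * extract(hour FROM ts) / 24))
             -- random noise: between 0.9x and 1.1x
             * (0.9 + 0.2 * random())) AS visits
FROM generate_series(
         TIMESTAMPTZ '2022-02-01 00:00:00+00',
         TIMESTAMPTZ '2022-02-07 23:00:00+00',
         INTERVAL '1 hour') AS ts
CROSS JOIN sites AS s
ORDER BY ts, s.site_id;
```

Swapping the sine term for another math function, or joining against a real lookup table instead of the inline VALUES list, changes the shape of the pattern without changing the overall structure of the query.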


Ryan Booz