WOODS

A set of Benchmarks for Out-of-Distribution Generalization in Time Series Tasks

Track: HPC, Big Data, and Data Science devroom
Room: D.hpc
Day: Sunday
Start: 16:30
End: 17:00

In the last decade, machine learning techniques have seen a significant surge in capabilities. Models with up to billions of parameters are now trained for a vast array of downstream tasks and achieve performance that defies what many considered possible 20 years ago. However, machine learning models often rely on spurious correlations, which prevents them from learning the intrinsic, invariant features of the data and causes them to fail to generalize to Out-Of-Distribution (OOD) data. Understanding and overcoming these failures has given rise to a research program on OOD generalization.
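As a rough illustration of this failure mode, the sketch below builds a toy time series task in plain Python. It is not taken from WOODS and does not use its datasets or API; every name in it (`make_domain`, `predict_by_mean`, `predict_by_trend`, `accuracy`) is hypothetical. The label is determined by the direction of the trend (the invariant feature), while the series offset agrees with the label only in the training domain (the spurious feature).

```python
# Toy illustration only: hypothetical data and names, not the WOODS benchmarks or API.

def make_domain(spurious_sign, n=50):
    """Build toy sequences whose label is the trend direction.

    spurious_sign = +1: the offset agrees with the label (training domain).
    spurious_sign = -1: the offset disagrees with the label (OOD domain).
    """
    data = []
    for i in range(n):
        label = i % 2                                   # alternate classes 0 and 1
        slope = 1.0 if label == 1 else -1.0             # invariant feature
        offset = spurious_sign * (5.0 if label == 1 else -5.0)  # spurious feature
        series = [offset + slope * t * 0.1 for t in range(10)]
        data.append((series, label))
    return data

def predict_by_mean(series):
    """Shortcut rule: relies on the offset, which is only spuriously tied to the label."""
    return 1 if sum(series) / len(series) > 0 else 0

def predict_by_trend(series):
    """Invariant rule: relies on the trend direction, which defines the label."""
    return 1 if series[-1] - series[0] > 0 else 0

def accuracy(predict, data):
    """Fraction of sequences for which the rule returns the true label."""
    return sum(predict(s) == y for s, y in data) / len(data)

train_domain = make_domain(spurious_sign=+1)
ood_domain = make_domain(spurious_sign=-1)

for name, rule in [("mean (spurious)", predict_by_mean),
                   ("trend (invariant)", predict_by_trend)]:
    print(name, accuracy(rule, train_domain), accuracy(rule, ood_domain))
```

Both rules are perfect on the training domain, so in-distribution accuracy alone cannot tell them apart; only the domain where the offset correlation flips exposes the shortcut rule.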

OOD generalization has been extensively explored for static computer vision tasks (DomainBed, WILDS) but remains severely underexplored for time series tasks, which are essential to many areas of applied machine learning, e.g., medicine, finance, and communication. We propose a set of new open-source OOD generalization datasets for sequential prediction tasks spanning multiple modalities; they act as benchmarks for algorithms that aim to promote invariant learning. Along with the datasets, we provide a fair and systematic open-source platform for evaluating existing and future algorithms on them. We also provide a leaderboard, which currently reports the performance of popular OOD generalization algorithms.