FOSDEM 2025

OpenCL, CUDA, and HIP as compilation targets for functional array programs

Track: HPC, Big Data & Data Science
Room: UB5.132
Day: Sunday
Start: 14:00
End: 14:10

OpenCL, CUDA, and HIP are possibly the most popular APIs for low-level GPU programming, and most GPUs support more than one of them. Superstition abounds about their relative performance, but little data is available, largely because it is very tedious to implement otherwise-equivalent programs in each API in order to compare them.

In this talk I present my experiences using OpenCL, CUDA, and HIP as compilation targets for Futhark, a functional array language. I compare the performance of OpenCL versus CUDA, and of OpenCL versus HIP, on code generated by the Futhark compiler for a collection of 48 application benchmarks on two different GPUs. This is probably the largest such comparison done, at least in terms of number of benchmarks. Although the generated code is in most cases equivalent, I observe significant performance differences on the same hardware. I can identify the root causes of most of these differences, many of which stem from relatively superficial details such as inconsistent defaults for compiler optimisation and numerical accuracy, although a few remain mysterious. The results are useful to anyone who seeks to generate low-level GPU code from higher-level specifications or libraries.

Speakers

Troels Henriksen

Attachments

Slides

Brussels / 1 & 2 February 2025
