So you want to do RDMA programming? RTRS: An easy to use, reliable high speed transport library over RDMA
- Track: Network
- Room: H.1302 (Depage)
- Day: Saturday
- Start: 14:30
- End: 14:50
- Video only: h1302
Description
- RDMA programming is considerably more complex than socket programming.
- RDMA is the industry standard for low-latency, high-throughput networking in data centers and high-performance computing (HPC) environments.
- RTRS is a reliable, high-speed transport library that provides a simple interface for performing RDMA. It is stable and proven, running on more than 5000 servers across our data centers.
- RTRS establishes a stateful session that provides features such as multipath, heartbeats, and reusability.
- It creates an optimal number of connections based on the number of CPUs, and uses IRQ pinning for data transfers.
- It allows users to send and receive data in the form of scatter-gather (sg) lists.
- RTRS is multipath capable (with different policies to choose from) and provides I/O fail-over and load-balancing functionality.
- RTRS pre-allocates and pre-maps DMA buffers on the server side to speed up data paths.
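The multipath behaviour described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not RTRS code or its kernel API: the `Path` and `Session` classes, the path names, and the policy strings `"round-robin"` and `"min-inflight"` are hypothetical names chosen for illustration. It shows how a session might pick a path per request under a policy and skip a failed path (fail-over).

```python
from dataclasses import dataclass


@dataclass
class Path:
    """Hypothetical stand-in for one connection path of a session."""
    name: str
    connected: bool = True
    inflight: int = 0  # requests submitted on this path but not yet completed


class Session:
    """Toy model of a multipath session (illustrative only, not the RTRS API)."""

    def __init__(self, paths, policy="round-robin"):
        self.paths = list(paths)
        self.policy = policy
        self._next = 0  # round-robin cursor over self.paths

    def _pick(self):
        live = [p for p in self.paths if p.connected]
        if not live:
            raise ConnectionError("no connected path left in session")
        if self.policy == "min-inflight":
            # Load balancing: choose the live path with the fewest pending requests.
            return min(live, key=lambda p: p.inflight)
        # Round-robin: advance the cursor, skipping disconnected paths (fail-over).
        while True:
            path = self.paths[self._next % len(self.paths)]
            self._next += 1
            if path.connected:
                return path

    def submit(self):
        """Select a path for one request and account for it as in flight."""
        path = self._pick()
        path.inflight += 1
        return path

    def complete(self, path):
        """Mark one request on the given path as finished."""
        path.inflight -= 1


if __name__ == "__main__":
    session = Session([Path("path0"), Path("path1")])
    print([session.submit().name for _ in range(4)])
    session.paths[0].connected = False  # simulate a link failure
    print([session.submit().name for _ in range(2)])
```

In this model, requests alternate between the two paths until one is marked as disconnected, after which all new requests go to the surviving path. A real transport would also have to resubmit requests that were in flight on the failed path, which the sketch leaves out.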
Benefit to the ecosystem
- An easy-to-use, reliable, and stable RDMA transport library on which any kind of module can be built. RTRS will provide an entry point for newcomers to RDMA.
- The pre-mapping capability is useful for high-performance workloads such as ML and AI training.
Link to the module
https://elixir.bootlin.com/linux/v6.17.7/source/drivers/infiniband/ulp/rtrs
Speakers
- Haris