Building Cloud Infrastructure for AI
- Track: Virtualization and Cloud Infrastructure
- Room: H.2213
- Day: Saturday
- Start (UTC+1): 14:00
- End (UTC+1): 14:30
- Room livestream: h2213
"GPU clouds" for AI applications are the hot topic at the moment, but they often end up being either big traditional HPC-style cluster deployments rather than actual cloud infrastructure, or systems built in secrecy by hyperscalers.
In this talk, we'll explore what makes a "GPU cloud" an actual cloud, how its requirements differ from those of traditional cloud infrastructure, and, most importantly, how you can build your own using open source technology - covering everything from hardware selection (do you really need to buy the six-figure boxes?), firmware (OpenBMC), networking (SONiC, VPP), storage (Ceph, SPDK), orchestration (K8s, but not the way you think), OS deployment (mkosi, UEFI HTTP netboot), virtualization (QEMU, vhost-user), and performance tuning (NUMA, RDMA), to various managed services (load balancing, API gateways, Slurm, etc.).
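To make the NUMA performance-tuning point concrete: when a VM is assigned a passed-through GPU, its vCPUs should be pinned to host CPUs on the same NUMA node as that GPU, so guest memory and DMA stay node-local. A minimal sketch of that placement decision is below - the topology data is hard-coded and hypothetical; on a real host it would come from sysfs (e.g. the PCI device's `numa_node` attribute) and the pinning would be applied via the hypervisor's CPU-affinity controls:

```python
# Sketch: choose host CPUs for a VM's vCPUs on the same NUMA node as its GPU.
# The topology here is a made-up example; real deployments would read it from
# /sys/bus/pci/devices/<bdf>/numa_node and /sys/devices/system/node/.

from typing import Dict, List, Set

def pick_vcpu_pinning(gpu_numa_node: int,
                      node_cpus: Dict[int, List[int]],
                      busy_cpus: Set[int],
                      vcpus: int) -> List[int]:
    """Return host CPU ids for `vcpus` vCPUs, all on the GPU's NUMA node."""
    free = [c for c in node_cpus.get(gpu_numa_node, []) if c not in busy_cpus]
    if len(free) < vcpus:
        raise RuntimeError("not enough free CPUs on the GPU's NUMA node")
    return free[:vcpus]

# Example: two NUMA nodes, CPUs 0-7 on node 0 and 8-15 on node 1,
# GPU attached to node 1, CPUs 8 and 9 already taken by another VM.
topology = {0: list(range(0, 8)), 1: list(range(8, 16))}
pinning = pick_vcpu_pinning(gpu_numa_node=1, node_cpus=topology,
                            busy_cpus={8, 9}, vcpus=4)
print(pinning)  # [10, 11, 12, 13] - all local to the GPU's node
```

The same node-local principle extends to the VM's memory (hugepages allocated from that node) and to the NIC used for RDMA traffic.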
Speakers
- Dave Hughes
- Lukas Stockner