Brussels / 1 & 2 February 2025

Low-level AI Engineering and Hacking


Read the Call for Papers at https://aifoundry.org/fosdem-2025-low-level-ai-engineering-hacking-dev-room.

Event Speakers Start End

Sunday

  Hugging Face ecosystem for Local AI/ML
VB 09:05 09:30
  The Local AI Rebellion
Justine Tunney 09:30 10:00
  ZML: A High-Performance AI Inference Stack Built for Production and Multi-Accelerator Deployment
Rene Schallner, Guillaume Wenzek 10:00 10:30
  History and advances of quantization in llama.cpp
Tanya Dadasheva, Iwan Kawrakow 10:30 11:00
  Quantizing your GGUF models using iterative refinement of the importance matrix
Robert Collins 11:00 11:20
  Apache Arrow: The Great Library Unifier
Matthew Topol 11:20 11:50
  Bringing AI to Wearable Systems: Integrating Vision, Audio, and Sensors on Constrained Hardware
Kris Kersey 11:50 11:55
  How Llamagator helps to implement the LLM-as-a-Judge concept on your local machine
Sergy Sergyenko 11:55 12:00
  Compositional LLMs for Assisted Competitive Coding
Ritwik Agarwal 12:00 12:05
  The Model Openness Framework (MOF)
Arnaud Le Hors 12:05 12:10
  Building AI Applications on Kubernetes: Leveraging InstructLab and the Bee Agent Framework
Martin Hickey, Paul Schweigert 12:10 12:15
  GPUStack: Building a Simple and Scalable Management Experience for Diverse AI Models
Lawrence Li 12:20 12:40
  Self-hosted LLMs at scale with Paddler
Mateusz Charytoniuk 12:40 13:00
  RamaLama: Making working with AI Models Boring
Eric Curtin 13:00 13:20
  Building AI Applications from your desktop with Podman AI Lab
Cedric Clyburn, Stevan Le Meur 13:20 13:40
  From Supercomputer to Raspberry Pi: Building Open Source Polish Language Models
Bielik Team, Maciej, Pawel Cyrta, Adrian 13:40 13:55
  Tricks Learned from Training Large Open-Source Models
Marcus Edel 13:55 14:10
  Synthetic Data: The Secret Ingredient in Better Language Models
Carol Chen, Cedric Clyburn 14:10 14:25
  LLM Tool use in vLLM
Max de Bayser 14:25 14:40
  Scoping out the Tenstorrent Wormhole
Peter Cawley 14:40 15:00
  Building a new GGML backend: How, Challenges and Opportunities with Novel Accelerators
Martin Chang 15:00 15:20
  Porting GGML to the NUX Kernel Development Framework
Gianluca Guida 15:20 15:40
  Accelerating AI with open source hardware and software
William Jones 15:40 16:00
  The bare metal perspective on AMD's GPU ASICs
Jon Chesterfield 16:00 16:20
  wllama: bringing llama.cpp to the web
Xuan-Son Nguyen 16:20 16:40
  Milliwatt sized Machine Learning on microcontrollers with emlearn
Jon Nordby 16:40 17:00