wllama: bringing llama.cpp to the web
- Track: Low-level AI Engineering and Hacking
- Room: UB2.252A (Lameere)
- Day: Sunday
- Start: 16:20
- End: 16:40
As one of the main contributors to the llama.cpp project, I’ve explored ways to bring its capabilities to the web through WebAssembly, creating a frontend solution for on-device inference with no servers or external APIs. This talk shares my journey implementing wllama, a lightweight TypeScript/JavaScript library designed to push llama.cpp’s limits in a web context. I’ll cover my motivations, the implementation details, the challenges I faced, and the future roadmap, offering insights into the technical and creative decisions behind the project.
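To make the idea concrete, here is a minimal sketch of what running a GGUF model in the browser with a wllama-style API can look like. It follows the pattern in the library's README (the `Wllama` class with `loadModelFromUrl` and `createCompletion`), but the exact package paths, option names, and the model URL are assumptions for illustration and may differ between versions.

```ts
// Hypothetical usage sketch; paths and options are assumptions, check the wllama docs.
import { Wllama } from '@wllama/wllama';

// Map of WASM binaries served by your app (single- and multi-threaded builds).
const CONFIG_PATHS = {
  'single-thread/wllama.wasm': '/assets/single-thread/wllama.wasm',
  'multi-thread/wllama.wasm': '/assets/multi-thread/wllama.wasm',
};

async function main() {
  const wllama = new Wllama(CONFIG_PATHS);

  // Download and load a (small) GGUF model directly in the browser.
  // Placeholder URL: substitute a real GGUF file hosted on your server or a model hub.
  await wllama.loadModelFromUrl('https://example.com/models/tinyllama-q4.gguf');

  // Run completion entirely on-device, no server-side inference involved.
  const output = await wllama.createCompletion('Once upon a time,', {
    nPredict: 50,
    sampling: { temp: 0.7, top_k: 40, top_p: 0.9 },
  });
  console.log(output);
}

main();
```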
Speakers
Xuan-Son Nguyen