Brussels / 1 & 2 February 2025

schedule

wllama: bringing llama.cpp to the web


As one of the main contributor of the llama.cpp project, I’ve explored ways to bring its capabilities to the web through WebAssembly, creating a frontend solution for on-device inference without the need for servers or external APIs. This talk shares my journey in implementing wllama, a lightweight TypeScript/JavaScript library designed to push llama.cpp’s limits in a web context. I’ll discuss my motivations, the implementation details, the challenges faced, and the future roadmap, offering insights into the technical and creative decisions behind the project.

Speakers

Photo of Xuan-Son Nguyen Xuan-Son Nguyen