RamaLama: Making working with AI Models Boring
- Track: Low-level AI Engineering and Hacking
- Room: UB2.252A (Lameere)
- Day: Sunday
- Start: 13:00
- End: 13:20
Managing and deploying AI models often requires extensive system configuration and complex software dependencies. RamaLama, a new open-source tool, aims to make working with AI models straightforward by leveraging container technology, making the process "boring": predictable, reliable, and easy to manage. RamaLama integrates with container engines such as Podman and Docker to run AI models inside containers, eliminating manual configuration and ensuring an optimal setup on both CPU and GPU systems.
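As a rough sketch of what this looks like in practice, the commands below show RamaLama pulling and running a model inside a container; the `tinyllama` model name is an illustrative choice, and flags may differ between versions:

```shell
# Pull a model from the Ollama registry (RamaLama's default transport).
ramalama pull ollama://tinyllama

# List locally available models.
ramalama list

# Run the model as an interactive chatbot; RamaLama selects a
# container image matched to the host's CPU or GPU automatically.
ramalama run ollama://tinyllama
```

The point of the container indirection is that the host needs no AI runtime installed at all: the runtime, its dependencies, and any GPU libraries live in the image.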
This talk will introduce RamaLama's key features, including support for multiple AI model registries (Ollama, Hugging Face, and OCI), simplified commands for running models as chatbots or REST API services, and compatibility with alternative AI runtimes such as llama.cpp and vLLM. We'll explore RamaLama's unique capabilities, such as generating Podman Quadlet files for edge deployments and Kubernetes YAML for scalable deployments, demonstrating how developers can move seamlessly from local experimentation to production. Join us to learn how RamaLama enables frictionless, containerized AI model deployment for developers and system administrators alike.
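The serving and deployment-generation workflow described above can be sketched as follows; the model name is illustrative and the exact `--generate` output targets may vary by RamaLama version:

```shell
# Serve a model as a REST API endpoint on the local machine.
ramalama serve ollama://tinyllama

# Emit a Podman Quadlet unit for systemd-managed edge deployments.
ramalama serve --generate quadlet ollama://tinyllama

# Emit Kubernetes YAML for deploying the same model to a cluster.
ramalama serve --generate kube ollama://tinyllama
```

Because the same `serve` command backs all three forms, a model validated locally can be promoted to an edge device or a cluster without rewriting its configuration.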
Speakers
Eric Curtin