How Llamagator helps implement the LLM-as-a-Judge concept on your local machine
- Track: Low-level AI Engineering and Hacking
- Room: UB2.252A (Lameere)
- Day: Sunday
- Start: 11:55
- End: 12:00
In this talk, I explore how the landscape of large language model (LLM) accessibility has shifted dramatically.
It is now possible to run these powerful models locally, right on your laptop, eliminating the need for cloud-based solutions like OpenAI. Previously, the sheer size of LLMs, requiring massive GPUs and RAM, made local deployment impossible for most developers. This reliance on cloud services limited experimentation, customization, and affordability.
My presentation focuses on llama.cpp, an inference engine that enables efficient execution of LLMs, including Meta's Llama, Qwen, and Mistral models, on CPUs.
I detail the process of acquiring, building, and quantizing models for local use, and show how Ruby bindings and llama.cpp's built-in HTTP server simplify interaction. I also introduce two open-source tools I've created: Llamagator and Rspec-Llama.
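To make that workflow concrete, here is a minimal sketch of querying a locally running llama.cpp server from Ruby. It assumes you have already built llama.cpp and started `llama-server` with a quantized GGUF model; the port and model file name are placeholder assumptions, not details from the talk.

```ruby
# A minimal sketch, assuming llama-server is already running with a quantized
# GGUF model, e.g.: llama-server -m model-q4_k_m.gguf --port 8080
# llama-server exposes an OpenAI-compatible /v1/chat/completions endpoint.
require "net/http"
require "json"

uri = URI("http://localhost:8080/v1/chat/completions")
request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
request.body = {
  messages: [{ role: "user", content: "What does quantization do to an LLM?" }],
  temperature: 0.2
}.to_json

response = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
puts JSON.parse(response.body).dig("choices", 0, "message", "content")
```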
Llamagator, a Rails application, streamlines the management, testing, and comparison of various LLMs, both local and cloud-based. With it, you can create prompts, define assertions, evaluate model performance, and easily implement the LLM-as-a-Judge pattern.
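The sketch below illustrates the LLM-as-a-Judge pattern itself rather than Llamagator's actual Rails code: one model produces a candidate answer, and a second model evaluates it against an assertion. The `ask` helper, endpoints, and ports are all hypothetical.

```ruby
require "net/http"
require "json"

# Hypothetical helper; Llamagator's real implementation lives in its Rails app.
def ask(endpoint, prompt)
  uri = URI(endpoint)
  request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
  request.body = { messages: [{ role: "user", content: prompt }] }.to_json
  response = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
  JSON.parse(response.body).dig("choices", 0, "message", "content")
end

# The model under test produces a candidate answer (port 8080 is an assumption).
candidate = ask("http://localhost:8080/v1/chat/completions",
                "Explain what model quantization is in two sentences.")

# A second model acts as the judge, turning a free-form answer into a verdict
# (port 8081 is an assumption; the judge could also be the same model).
judge_prompt = <<~PROMPT
  You are a strict evaluator. Reply PASS if the answer below mentions
  reducing the numeric precision of weights, otherwise reply FAIL.

  Answer: #{candidate}
PROMPT

verdict = ask("http://localhost:8081/v1/chat/completions", judge_prompt)
puts verdict.to_s.include?("PASS") ? "assertion passed" : "assertion failed"
```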
Rspec-Llama extends RSpec with a specialized DSL for interacting with and validating responses from LLMs, making it easy to integrate these models into testing workflows. These tools, combined with the ability to run LLMs locally, empower developers to explore AI's potential without relying on external providers.
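As a flavor of what such testing looks like, here is a plain RSpec spec asserting on a local model's response. This deliberately does not reproduce Rspec-Llama's own DSL (see the gem for that); the endpoint and helper are assumptions for illustration.

```ruby
require "rspec"
require "net/http"
require "json"

# Hypothetical endpoint for a locally served model; adjust to your setup.
LLM_URL = URI("http://localhost:8080/v1/chat/completions")

def llm_answer(prompt)
  request = Net::HTTP::Post.new(LLM_URL, "Content-Type" => "application/json")
  request.body = { messages: [{ role: "user", content: prompt }] }.to_json
  response = Net::HTTP.start(LLM_URL.hostname, LLM_URL.port) { |http| http.request(request) }
  JSON.parse(response.body).dig("choices", 0, "message", "content")
end

RSpec.describe "local LLM smoke test" do
  it "answers a simple factual question" do
    expect(llm_answer("What is the capital of France? One word.").downcase)
      .to include("paris")
  end
end
```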
Speakers
Sergy Sergyenko