FOSDEM 2025

A long, short history of realtime AI agents

Track: Real Time Communications (RTC)
Room: K.3.601
Day: Saturday
Start: 17:10
End: 17:25
Video only: k3601

Until a few months ago, the only working approach to connecting realtime AI agents to WebRTC streams and phone calls was a lengthy pipeline of speech-to-text, agent orchestration, and text-to-speech, often built from multiple machine learning models from commercial vendors. That has changed with new realtime speech-to-speech models, most famously the (closed) OpenAI Advanced Voice, but what are the open source ways to build these kinds of systems? This talk walks through my experience of using four different projects to build functional systems that can use open source (open weights) models at their core. We will talk about how we have integrated Jambonz, LiveKit, and Ultravox (Fixie.AI) within our Aplisay framework and what this allows us to do.

Speakers

Rob Pickering

Attachments

Talk pdf

fosdem-2025

Brussels / 1 & 2 February 2025
