A long, short history of realtime AI agents
- Track: Real Time Communications (RTC)
- Room: K.3.601
- Day: Saturday
- Start: 17:10
- End: 17:25
- Video only: k3601
- Chat: Join the conversation!
Until a few months ago, the only working approach for connecting realtime AI agents to WebRTC streams and phone calls was a lengthy pipeline of speech-to-text, agent orchestration, and text-to-speech, often using multiple machine learning models from commercial vendors. That has changed with new realtime speech-to-speech models, most famously the (closed) OpenAI advanced voice, but what are the open source ways to build these kinds of systems? This talk walks through my experience of using 4 different projects to build functional systems that can use open source (open weights) models at their core. We will talk about how we have integrated Jambonz, LiveKit, and Ultravox (Fixie.AI) within our Aplisay framework and what this allows us to do.
Speakers
Rob Pickering
Attachments
Links
- Live interactive talk facilitated by WebRTC AI
- Video recording (AV1/WebM) - 31.3 MB
- Video recording (MP4) - 242.9 MB
- Video recording subtitle file (VTT)
- The talk recording didn't work out too well, so here is a bit more info about the session, including a link to the self-paced presentation
- Chat room (web)
- Chat room (app)