Enabling Intelligent Media Playback on RISC-V: VLC with Whisper STT and Qwen T2T on Next-Gen RISC-V AI PCs
- Track: Open Media devroom
- Room: K.4.601
- Day: Saturday
- Start: 13:05
- End: 13:25
This joint talk by DeepComputing and contributors from the VLC project showcases how intelligent media playback and real-time audio processing are becoming a reality on open RISC-V hardware. We demonstrate VLC running Whisper (speech-to-text) and Qwen (text-to-text LLM) on ESWIN’s EIC7702 SoC with a 40-TOPS NPU, achieving practical AI-enhanced multimedia performance entirely on RISC-V. We will walk through the porting process, performance tuning across CPU/NPU, audio pipeline integration, and the technical challenges of enabling real-time inference on today’s RISC-V AI PCs. The session will also preview our upcoming 16-core RISC-V platform and discuss how VLC’s evolving AI support roadmap aligns with this next generation of RISC-V hardware. Together, we outline the upstreaming efforts required to bring AI-accelerated playback, real-time captioning, translation, and other intelligent media features to the broader open-source community.
Speakers
- Jean Baptiste Kempf
- Yuning Liang