FOSDEM 2026

Implementing Block-Max Pruning in Rust: Faster Learned Sparse Retrieval for Modern Search

Track: Search
Room: UB4.136
Day: Sunday
Start: 13:20
End: 13:50
Video only: ub4136

Learned sparse retrieval models such as SPLADE, uniCOIL, and other transformer-based sparse encoders have become popular for delivering neural-level relevance while preserving the efficiency of inverted indexes. But these models also produce indexes with statistical properties radically different from classic BM25: longer queries, compressed vocabularies, and posting lists with unusual score distributions. As a result, traditional dynamic pruning algorithms like WAND and Block-Max WAND often fail to reach their full potential on these indexes.

This talk presents Block-Max Pruning (BMP) from a systems and Rust-engineering perspective. We will walk through how BMP restructures query processing by partitioning document space into small, contiguous blocks and maintaining lightweight, on-the-fly score upper bounds that guide safe or approximate early termination.
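The block-based processing described above can be illustrated with a minimal sketch. It is not the API of the BMP repository: `ToyIndex`, `build`, `search`, `demo`, the integer impact scores, and the `alpha` parameter are all hypothetical names and simplifications chosen for illustration. The sketch assumes fixed-size blocks of consecutive document ids, a precomputed per-term maximum impact for each block, and a query-time upper bound per block obtained by summing weighted block maxima. Blocks are visited in decreasing order of that bound, and processing stops once no remaining block can beat the current k-th score.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap};

/// Hypothetical toy index: postings are (doc_id, impact) pairs sorted by doc_id.
struct ToyIndex {
    block_size: usize,
    num_blocks: usize,
    postings: HashMap<u32, Vec<(usize, u32)>>,
    /// For each term, the maximum impact inside each block of doc ids.
    block_max: HashMap<u32, Vec<u32>>,
}

impl ToyIndex {
    /// Assumes every doc_id in `postings` is smaller than `num_docs`.
    fn build(num_docs: usize, block_size: usize, postings: HashMap<u32, Vec<(usize, u32)>>) -> Self {
        let num_blocks = (num_docs + block_size - 1) / block_size;
        let mut block_max = HashMap::new();
        for (&term, list) in &postings {
            let mut maxes = vec![0u32; num_blocks];
            for &(doc, impact) in list {
                let b = doc / block_size;
                maxes[b] = maxes[b].max(impact);
            }
            block_max.insert(term, maxes);
        }
        ToyIndex { block_size, num_blocks, postings, block_max }
    }

    /// Returns the top-k (score, doc_id) pairs and the number of blocks scored.
    /// `alpha = 1.0` gives safe termination; `alpha < 1.0` stops earlier (approximate).
    fn search(&self, query: &[(u32, u32)], k: usize, alpha: f32) -> (Vec<(u32, usize)>, usize) {
        if k == 0 {
            return (Vec::new(), 0);
        }
        // 1. Query-time upper bound per block: sum of weight * block maximum.
        let mut bounds = vec![0u32; self.num_blocks];
        for &(term, weight) in query {
            if let Some(maxes) = self.block_max.get(&term) {
                for (b, &m) in maxes.iter().enumerate() {
                    bounds[b] += weight * m;
                }
            }
        }
        // 2. Visit blocks in decreasing order of upper bound.
        let mut order: Vec<usize> = (0..self.num_blocks).collect();
        order.sort_by_key(|&b| Reverse(bounds[b]));

        // Min-heap holding the current top-k results.
        let mut heap: BinaryHeap<Reverse<(u32, usize)>> = BinaryHeap::new();
        let mut blocks_scored = 0;
        for b in order {
            // 3. Early termination: no remaining block can beat the k-th score.
            let threshold = heap.peek().map_or(0, |r| (r.0).0);
            if heap.len() == k && (bounds[b] as f32) * alpha <= threshold as f32 {
                break;
            }
            // 4. Score every document of this block into a small accumulator array.
            blocks_scored += 1;
            let (lo, hi) = (b * self.block_size, (b + 1) * self.block_size);
            let mut scores = vec![0u32; self.block_size];
            for &(term, weight) in query {
                if let Some(list) = self.postings.get(&term) {
                    let start = list.partition_point(|&(d, _)| d < lo);
                    for &(d, impact) in list[start..].iter().take_while(|&&(d, _)| d < hi) {
                        scores[d - lo] += weight * impact;
                    }
                }
            }
            for (i, &s) in scores.iter().enumerate() {
                if s == 0 {
                    continue;
                }
                if heap.len() < k {
                    heap.push(Reverse((s, lo + i)));
                } else if s > heap.peek().map_or(0, |r| (r.0).0) {
                    heap.pop();
                    heap.push(Reverse((s, lo + i)));
                }
            }
        }
        let mut top: Vec<(u32, usize)> = heap.into_iter().map(|Reverse(x)| x).collect();
        // Highest score first; ties broken by smaller doc id.
        top.sort_by(|a, b| b.0.cmp(&a.0).then(a.1.cmp(&b.1)));
        (top, blocks_scored)
    }
}

/// Tiny usage example: 8 documents, blocks of 2, a two-term weighted query.
fn demo() -> (Vec<(u32, usize)>, usize) {
    let mut postings = HashMap::new();
    postings.insert(1, vec![(0, 1), (3, 5), (6, 2)]);
    postings.insert(2, vec![(3, 4), (4, 1), (7, 3)]);
    let index = ToyIndex::build(8, 2, postings);
    index.search(&[(1, 2), (2, 1)], 2, 1.0)
}
```

In this toy setup, `demo()` scores only two of the four blocks before the remaining upper bounds fall below the current second-best score. A production implementation would differ in many respects, for example in how block maxima are stored and how the block ordering is computed.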

The talk is aimed at developers building retrieval engines, Rust-based data systems, or ML-powered search pipelines who want to push sparse retrieval performance further. Attendees will leave with a clear understanding of how BMP works, why learned sparse models require new pruning strategies, and how to integrate these ideas into modern, high-performance Rust codebases.

Code and resources:

	BMP GitHub repository: https://github.com/pisa-engine/BMP/
	Paper (SIGIR 2024): https://www.antoniomallia.it/uploads/SIGIR24.pdf

Speakers

	Ferdinand Schlatt
	Antonio Mallia

Brussels / 31 January & 1 February 2026
