Binary fuse filters: Fast and smaller than xor filters (2022)
Recorded: Jan. 22, 2026, 11:03 a.m.
| Original | Summarized |
[2201.01174] Binary Fuse Filters: Fast and Smaller Than Xor Filters
Skip to main content We gratefully acknowledge support from the Simons Foundation, member institutions, and all contributors. > cs > arXiv:2201.01174 Help | Advanced Search All fields Search open search GO open navigation menu quick links Login Computer Science > Data Structures and Algorithms arXiv:2201.01174 (cs) [Submitted on 4 Jan 2022] Abstract:Bloom and cuckoo filters provide fast approximate set membership while using little memory. Engineers use them to avoid expensive disk and network accesses. The recently introduced xor filters can be faster and smaller than Bloom and cuckoo filters. The xor filters are within 23% of the theoretical lower bound in storage as opposed to 44% for Bloom filters. Inspired by Dietzfelbinger and Walzer, we build probabilistic filters -- called binary fuse filters -- that are within 13% of the storage lower bound -- without sacrificing query speed. As an additional benefit, the construction of the new binary fuse filters can be more than twice as fast as the construction of xor filters. By slightly sacrificing query speed, we further reduce storage to within 8% of the lower bound. We compare the performance against a wide range of competitive alternatives such as Bloom filters, blocked Bloom filters, vector quotient filters, cuckoo filters, and the recent ribbon filters. Our experiments suggest that binary fuse filters are superior to xor filters. Subjects: Data Structures and Algorithms (cs.DS) Cite as: Focus to learn more arXiv-issued DOI via DataCite Related DOI: Focus to learn more DOI(s) linking to related resources Submission history From: Daniel Lemire [view email] [v1]
Full-text links: View a PDF of the paper titled Binary Fuse Filters: Fast and Smaller Than Xor Filters, by Thomas Mueller Graf and Daniel LemireView PDFTeX Source view license < prev | new Change to browse by: References & Citations NASA ADSGoogle Scholar DBLP - CS Bibliography listing | bibtex Daniel Lemire export BibTeX citation BibTeX formatted citation loading... Data provided by: Bookmark
Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Code, Data and Media Associated with this Article alphaXiv Toggle alphaXiv (What is alphaXiv?) Links to Code Toggle CatalyzeX Code Finder for Papers (What is CatalyzeX?) DagsHub Toggle DagsHub (What is DagsHub?) GotitPub Toggle Gotit.pub (What is GotitPub?) Huggingface Toggle Hugging Face (What is Huggingface?) Links to Code Toggle Papers with Code (What is Papers with Code?) ScienceCast Toggle ScienceCast (What is ScienceCast?) Demos Demos Replicate Toggle Replicate (What is Replicate?) Spaces Toggle Hugging Face Spaces (What is Spaces?) Spaces Toggle TXYZ.AI (What is TXYZ.AI?) Related Papers Recommenders and Search Tools Link to Influence Flower Influence Flower (What are Influence Flowers?) Core recommender toggle CORE Recommender (What is CORE?) Author About arXivLabs arXivLabs: experimental projects with community collaborators Which authors of this paper are endorsers? | About contact arXivClick here to contact arXiv subscribe to arXiv mailingsClick here to subscribe Copyright Web Accessibility Assistance arXiv Operational Status |
Thomas Mueller Graf and Daniel Lemire introduce a novel approach to probabilistic data structures known as binary fuse filters in their paper, “Binary Fuse Filters: Fast and Smaller Than Xor Filters” (arXiv:2201.01174). The research builds upon previously established methods, specifically referencing Dietzfelbinger and Walzer’s work, to develop filters that achieve significantly improved storage efficiency compared to existing alternatives. The core innovation lies in achieving a storage lower bound of approximately 13%, a substantial reduction compared to the 44% lower bound observed in Bloom filters and the 23% lower bound of xor filters. Crucially, the construction process for binary fuse filters is notably faster, exceeding twice the speed of constructing xor filters. To further enhance their performance, the authors deliberately sacrifice query speed, bringing the storage efficiency down to approximately 8% of the theoretical lower bound. This trade-off allows binary fuse filters to outperform xor filters across a range of comparative experiments. The research thoroughly evaluated the filters against several competing structures, including Bloom filters, blocked Bloom filters, vector quotient filters, cuckoo filters, and ribbon filters. The experimental results decisively indicate that binary fuse filters represent a superior design. The paper highlights a deliberate design choice, balancing storage efficiency with query speed. This reveals a strategic approach within the development of these probabilistic filters, prioritizing a particular metric for optimization. |