LmCast :: Stay tuned in

Binary fuse filters: Fast and smaller than xor filters (2022)

Recorded: Jan. 22, 2026, 11:03 a.m.

Original Summarized

[2201.01174] Binary Fuse Filters: Fast and Smaller Than Xor Filters

Skip to main content

We gratefully acknowledge support from the Simons Foundation, member institutions, and all contributors.
Donate

> cs > arXiv:2201.01174

Help | Advanced Search

All fields
Title
Author
Abstract
Comments
Journal reference
ACM classification
MSC classification
Report number
arXiv identifier
DOI
ORCID
arXiv author ID
Help pages
Full text

Search

open search

GO

open navigation menu

quick links

Login
Help Pages
About

Computer Science > Data Structures and Algorithms

arXiv:2201.01174 (cs)

[Submitted on 4 Jan 2022]
Title:Binary Fuse Filters: Fast and Smaller Than Xor Filters
Authors:Thomas Mueller Graf, Daniel Lemire View a PDF of the paper titled Binary Fuse Filters: Fast and Smaller Than Xor Filters, by Thomas Mueller Graf and Daniel Lemire
View PDF

Abstract:Bloom and cuckoo filters provide fast approximate set membership while using little memory. Engineers use them to avoid expensive disk and network accesses. The recently introduced xor filters can be faster and smaller than Bloom and cuckoo filters. The xor filters are within 23% of the theoretical lower bound in storage as opposed to 44% for Bloom filters. Inspired by Dietzfelbinger and Walzer, we build probabilistic filters -- called binary fuse filters -- that are within 13% of the storage lower bound -- without sacrificing query speed. As an additional benefit, the construction of the new binary fuse filters can be more than twice as fast as the construction of xor filters. By slightly sacrificing query speed, we further reduce storage to within 8% of the lower bound. We compare the performance against a wide range of competitive alternatives such as Bloom filters, blocked Bloom filters, vector quotient filters, cuckoo filters, and the recent ribbon filters. Our experiments suggest that binary fuse filters are superior to xor filters.

Subjects:

Data Structures and Algorithms (cs.DS)

Cite as:
arXiv:2201.01174 [cs.DS]

 
(or
arXiv:2201.01174v1 [cs.DS] for this version)

 
https://doi.org/10.48550/arXiv.2201.01174

Focus to learn more

arXiv-issued DOI via DataCite


Journal reference:
Journal of Experimental Algorithmics 27, 2022

Related DOI:

https://doi.org/10.1145/3510449

Focus to learn more

DOI(s) linking to related resources

Submission history From: Daniel Lemire [view email] [v1]
Tue, 4 Jan 2022 15:05:24 UTC (1,425 KB)

Full-text links:
Access Paper:

View a PDF of the paper titled Binary Fuse Filters: Fast and Smaller Than Xor Filters, by Thomas Mueller Graf and Daniel LemireView PDFTeX Source

view license


Current browse context: cs.DS

< prev

  |  
next >

new
|
recent
| 2022-01

Change to browse by:

cs

References & Citations

NASA ADSGoogle Scholar
Semantic Scholar

DBLP - CS Bibliography

listing | bibtex

Daniel Lemire

export BibTeX citation
Loading...

BibTeX formatted citation
×

loading...

Data provided by:

Bookmark

Bibliographic Tools

Bibliographic and Citation Tools

Bibliographic Explorer Toggle

Bibliographic Explorer (What is the Explorer?)

Connected Papers Toggle

Connected Papers (What is Connected Papers?)

Litmaps Toggle

Litmaps (What is Litmaps?)

scite.ai Toggle

scite Smart Citations (What are Smart Citations?)

Code, Data, Media

Code, Data and Media Associated with this Article

alphaXiv Toggle

alphaXiv (What is alphaXiv?)

Links to Code Toggle

CatalyzeX Code Finder for Papers (What is CatalyzeX?)

DagsHub Toggle

DagsHub (What is DagsHub?)

GotitPub Toggle

Gotit.pub (What is GotitPub?)

Huggingface Toggle

Hugging Face (What is Huggingface?)

Links to Code Toggle

Papers with Code (What is Papers with Code?)

ScienceCast Toggle

ScienceCast (What is ScienceCast?)

Demos

Demos

Replicate Toggle

Replicate (What is Replicate?)

Spaces Toggle

Hugging Face Spaces (What is Spaces?)

Spaces Toggle

TXYZ.AI (What is TXYZ.AI?)

Related Papers

Recommenders and Search Tools

Link to Influence Flower

Influence Flower (What are Influence Flowers?)

Core recommender toggle

CORE Recommender (What is CORE?)

Author
Venue
Institution
Topic

About arXivLabs

arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.

Which authors of this paper are endorsers? |
Disable MathJax (What is MathJax?)

About
Help

contact arXivClick here to contact arXiv
Contact

subscribe to arXiv mailingsClick here to subscribe
Subscribe

Copyright
Privacy Policy

Web Accessibility Assistance

arXiv Operational Status

Thomas Mueller Graf and Daniel Lemire introduce a novel approach to probabilistic data structures known as binary fuse filters in their paper, “Binary Fuse Filters: Fast and Smaller Than Xor Filters” (arXiv:2201.01174). The research builds upon previously established methods, specifically referencing Dietzfelbinger and Walzer’s work, to develop filters that achieve significantly improved storage efficiency compared to existing alternatives. The core innovation lies in achieving a storage lower bound of approximately 13%, a substantial reduction compared to the 44% lower bound observed in Bloom filters and the 23% lower bound of xor filters. Crucially, the construction process for binary fuse filters is notably faster, exceeding twice the speed of constructing xor filters. To further enhance their performance, the authors deliberately sacrifice query speed, bringing the storage efficiency down to approximately 8% of the theoretical lower bound. This trade-off allows binary fuse filters to outperform xor filters across a range of comparative experiments. The research thoroughly evaluated the filters against several competing structures, including Bloom filters, blocked Bloom filters, vector quotient filters, cuckoo filters, and ribbon filters. The experimental results decisively indicate that binary fuse filters represent a superior design. The paper highlights a deliberate design choice, balancing storage efficiency with query speed. This reveals a strategic approach within the development of these probabilistic filters, prioritizing a particular metric for optimization.