A space-efficient probabilistic data structure for set membership queries that supports dynamic additions and deletions.
Cuckoo Filter is a probabilistic data structure that provides approximated set-membership queries, answering "if item x is in a set?" with a known false positive rate. It improves upon Bloom filters by supporting dynamic addition and deletion of items while maintaining space efficiency, especially at low false positive rates (<3%). The implementation uses cuckoo hashing to store compact fingerprints in buckets.
Developers and engineers building systems that require efficient set membership testing with mutable datasets, such as databases, network routers, caching layers, or applications needing duplicate detection.
It offers a practical alternative to Bloom filters by enabling item deletion without the space overhead of counting Bloom filters, and it can be more compact for low false positive rates, making it ideal for applications where datasets change dynamically.
Cuckoo Filter: Practically Better Than Bloom
Open-Awesome is built by the community, for the community. Submit a project, suggest an awesome list, or help improve the catalog on GitHub.
Supports insertion and deletion of items, unlike standard Bloom filters, without the significant space overhead of counting variants, as highlighted in the README's comparison.
Uses cuckoo hashing to store compact 8-bit fingerprints, consuming less memory than Bloom filters for false positive rates under 3%, per the README's analysis.
Implements the original paper's recommendations with 2 bucket indices and static bucket sizes of 4 fingerprints, ensuring balanced trade-offs for common use cases.
Offers a clean Go API with full documentation on GoDoc, making it straightforward to use in Go projects with methods like InsertUnique and Delete.
Uses a hardcoded 8-bit fingerprint size, resulting in a ~3% false positive rate that cannot be adjusted without code changes, unlike other implementations with configurable sizes.
This is a Go-only library, so it's unsuitable for projects in other programming languages, limiting its portability and requiring rewrites for multi-language ecosystems.
The README and API do not provide built-in methods for saving or loading filter state, which complicates persistence and sharing across sessions in applications.