A high-performance Java implementation of a Cuckoo filter, supporting deletions, counting, and concurrent operations.
CuckooFilter4J is a high-performance Java implementation of a Cuckoo filter, a probabilistic data structure that efficiently tests whether an element is a member of a set. It solves the need for space-efficient approximate membership queries with support for deletions and concurrent operations, addressing limitations of traditional Bloom filters.
It is aimed at Java developers building applications that need efficient set-membership testing, such as databases, caching systems, network routers, and other data-intensive services where low false-positive rates and dynamic updates are critical.
Developers choose CuckooFilter4J for its thread-safe design, ability to delete items without extra space cost, and performance comparable to Guava's Bloom filters, making it a versatile upgrade for multi-threaded environments.
High-performance Java implementation of a Cuckoo filter, Apache licensed.
Allows removal of items without additional space cost, unlike Bloom filters, which typically require counting variants for deletion.
Supports multi-threaded operation out of the box, so concurrent applications do not need the external synchronization that a non-thread-safe filter would require; the comparison drawn is with Guava's Bloom filter.
Configurable with various hash algorithms, including faster xxHash and secure options like SHA, offering performance and security trade-offs.
Provides an approximate count of item insertions up to a limit, useful for tracking duplicates without the full overhead of counting Bloom filters.
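To make the deletion and counting points concrete, here is a minimal, single-threaded sketch of how a cuckoo filter can support both. It is a toy written for this summary, not CuckooFilter4J's code or API: the class name `TinyCuckooFilter`, the 15-bit fingerprint, the four slots per bucket, and the hash mixing are all assumptions chosen for brevity.

```java
import java.util.Random;

/** Toy cuckoo filter for illustration; not CuckooFilter4J's code or API. */
final class TinyCuckooFilter {
    private static final int SLOTS = 4;         // fingerprints per bucket (assumed)
    private static final int MAX_KICKS = 500;   // relocation attempts before giving up
    private final short[][] buckets;            // 0 marks an empty slot
    private final int mask;
    private final Random rng = new Random(42);  // fixed seed keeps runs repeatable

    /** @param bucketCount number of buckets; must be a power of two */
    TinyCuckooFilter(int bucketCount) {
        if (Integer.bitCount(bucketCount) != 1) {
            throw new IllegalArgumentException("bucketCount must be a power of two");
        }
        buckets = new short[bucketCount][SLOTS];
        mask = bucketCount - 1;
    }

    private static int mix(int x) {             // MurmurHash3 32-bit finalizer
        x ^= x >>> 16; x *= 0x85ebca6b;
        x ^= x >>> 13; x *= 0xc2b2ae35;
        return x ^ (x >>> 16);
    }

    private static short fingerprint(Object item) {
        int fp = mix(item.hashCode() + 0x9e3779b9) & 0x7fff; // 15-bit fingerprint
        return (short) (fp == 0 ? 1 : fp);                   // reserve 0 for "empty"
    }

    private int primaryIndex(Object item) { return mix(item.hashCode()) & mask; }

    // XOR with a hash of the fingerprint: applying it twice returns the original bucket,
    // so an entry can be relocated knowing only its fingerprint and current bucket.
    private int alternateIndex(int index, short fp) { return (index ^ mix(fp)) & mask; }

    private boolean store(int index, short fp) {
        for (int s = 0; s < SLOTS; s++) {
            if (buckets[index][s] == 0) { buckets[index][s] = fp; return true; }
        }
        return false;
    }

    boolean put(Object item) {
        short fp = fingerprint(item);
        int i1 = primaryIndex(item);
        int i2 = alternateIndex(i1, fp);
        if (store(i1, fp) || store(i2, fp)) return true;
        int index = rng.nextBoolean() ? i1 : i2;
        for (int kick = 0; kick < MAX_KICKS; kick++) {
            int slot = rng.nextInt(SLOTS);
            short victim = buckets[index][slot];  // evict a resident fingerprint
            buckets[index][slot] = fp;
            fp = victim;
            index = alternateIndex(index, fp);    // try the victim's other bucket
            if (store(index, fp)) return true;
        }
        // Table too full. This sketch drops the last evicted fingerprint;
        // a production filter would have to preserve it or resize.
        return false;
    }

    private int countIn(int index, short fp) {
        int n = 0;
        for (short s : buckets[index]) if (s == fp) n++;
        return n;
    }

    /** Approximate: counts matching fingerprints, so colliding items are included. */
    int count(Object item) {
        short fp = fingerprint(item);
        int i1 = primaryIndex(item);
        int i2 = alternateIndex(i1, fp);
        return i2 == i1 ? countIn(i1, fp) : countIn(i1, fp) + countIn(i2, fp);
    }

    boolean mightContain(Object item) { return count(item) > 0; }

    /** Removes one copy of the fingerprint; only safe for items that were actually inserted. */
    boolean delete(Object item) {
        short fp = fingerprint(item);
        int i1 = primaryIndex(item);
        for (int index : new int[] {i1, alternateIndex(i1, fp)}) {
            for (int s = 0; s < SLOTS; s++) {
                if (buckets[index][s] == fp) { buckets[index][s] = 0; return true; }
            }
        }
        return false;
    }
}
```

Because each stored fingerprint occupies its own slot, deleting means clearing one slot and counting means tallying matching slots in the two candidate buckets, with no extra per-item counters. In this toy layout the same item can occupy at most two buckets of four slots, which is one plausible way a hard cap on duplicates arises.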
The project is explicitly marked as unmaintained with known bugs, making it unreliable and risky for production use.
The main branch uses object-based operations that create garbage collection pressure, while the faster primitive branch is separate and not integrated.
Multithreading support is labeled as beta with potential undiscovered bugs and deadlocks, posing a risk for high-concurrency applications.
The same item can be inserted at most 8 to 9 times; exceeding this duplicate limit can degrade performance and cause inserts to fail, so duplicate-heavy workloads need careful management.
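One way to manage the duplicate limit is to check the approximate count before inserting and refuse further copies once a soft cap is reached. The sketch below illustrates that pattern only: the `CountingFilter` interface, its method names, the `DuplicateGuard` helper, and the cap of 7 are hypothetical and are not taken from CuckooFilter4J. `MapBackedFilter` is an exact in-memory stand-in so that the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical minimal view of a counting filter; not CuckooFilter4J's interface. */
interface CountingFilter<T> {
    boolean put(T item);
    int approximateCount(T item);
}

final class DuplicateGuard {
    // Assumption: stay one below the 8 to 9 ceiling described above.
    static final int SOFT_LIMIT = 7;

    /** Inserts the item unless it already appears SOFT_LIMIT times. */
    static <T> boolean putBounded(CountingFilter<T> filter, T item) {
        if (filter.approximateCount(item) >= SOFT_LIMIT) {
            return false; // refuse rather than risk a failed or slow insert
        }
        return filter.put(item);
    }
}

/** Exact stand-in used only to exercise the guard; a real filter is approximate. */
final class MapBackedFilter<T> implements CountingFilter<T> {
    private final Map<T, Integer> counts = new HashMap<>();

    @Override
    public boolean put(T item) {
        counts.merge(item, 1, Integer::sum);
        return true;
    }

    @Override
    public int approximateCount(T item) {
        return counts.getOrDefault(item, 0);
    }
}
```

The check and the insert are two separate steps, so under concurrent writers two threads could both pass the check; callers that need a strict cap would have to serialize this sequence themselves.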