A Go library of probabilistic data structures for processing continuous, unbounded data streams.
Boom Filters is a Go library implementing a suite of probabilistic data structures for processing continuous, unbounded data streams. It solves problems like deduplication, cardinality estimation, and frequency counting in scenarios where data size is unknown or too large for exact tracking. The library includes variants like Stable Bloom Filters, Scalable Bloom Filters, HyperLogLog, and Count-Min Sketch.
Go developers building systems that process high-volume, unbounded data streams, such as event processing pipelines, real-time analytics, monitoring tools, or databases needing approximate set membership and frequency queries.
Developers choose Boom Filters for its comprehensive, production-ready implementations of probabilistic structures in Go, offering memory efficiency and configurable accuracy for stream processing without requiring prior knowledge of data set size.
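To make the frequency-counting use case concrete, here is a minimal Count-Min Sketch written against the Go standard library only. It is an illustrative sketch, not the Boom Filters implementation: the type CountMinSketch, its methods, the FNV-based double hashing, and the depth and width values are all assumptions chosen for this example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// CountMinSketch is a hypothetical, minimal sketch for illustration only;
// it is not the type exported by Boom Filters.
type CountMinSketch struct {
	depth, width uint32
	counts       [][]uint64
}

// NewCountMinSketch allocates depth rows of width counters each.
func NewCountMinSketch(depth, width uint32) *CountMinSketch {
	counts := make([][]uint64, depth)
	for i := range counts {
		counts[i] = make([]uint64, width)
	}
	return &CountMinSketch{depth: depth, width: width, counts: counts}
}

// index picks the column for a row using double hashing:
// (h1 + row*h2) mod width, with h1 and h2 taken from one 64-bit FNV-1a hash.
func (c *CountMinSketch) index(item []byte, row uint32) uint32 {
	h := fnv.New64a()
	h.Write(item)
	sum := h.Sum64()
	h1, h2 := uint32(sum), uint32(sum>>32)
	return (h1 + row*h2) % c.width
}

// Add increments one counter per row for the item.
func (c *CountMinSketch) Add(item []byte) {
	for row := uint32(0); row < c.depth; row++ {
		c.counts[row][c.index(item, row)]++
	}
}

// Count returns the smallest counter seen across rows. Collisions can only
// inflate counters, so the estimate never falls below the true count.
func (c *CountMinSketch) Count(item []byte) uint64 {
	lowest := c.counts[0][c.index(item, 0)]
	for row := uint32(1); row < c.depth; row++ {
		if v := c.counts[row][c.index(item, row)]; v < lowest {
			lowest = v
		}
	}
	return lowest
}

func main() {
	cms := NewCountMinSketch(4, 1024)
	for i := 0; i < 3; i++ {
		cms.Add([]byte("login"))
	}
	cms.Add([]byte("logout"))
	fmt.Println("estimated login count:", cms.Count([]byte("login")))
}
```

Memory use is fixed at depth times width counters regardless of how many distinct items the stream contains, which is the property that makes this kind of structure suitable for unbounded input.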
Includes implementations of Stable Bloom Filters, Scalable Bloom Filters, HyperLogLog, Count-Min Sketch, and other structures, covering the common stream-processing needs of set membership, cardinality estimation, and frequency counting, as detailed in the README.
Designed for unbounded data streams with bounded, configurable memory usage. Stable Bloom Filters, for example, evict stale data to keep the false-positive rate stable.
Allows setting parameters like false-positive rates and error bounds, enabling developers to balance accuracy, memory, and performance based on specific use cases.
Provides clear usage examples, serialization support for structures like HyperLogLog, and references to academic papers, facilitating real-world adoption.
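The trade-off between false-positive rate and memory can be made concrete with the standard sizing formulas for a classic Bloom filter: m = -n ln(p) / (ln 2)^2 bits and k = log2(1/p) hash functions, for n expected items and target false-positive rate p. The sketch below is a worked example of that arithmetic only; the function names optimalBits and optimalHashes are hypothetical helpers for this illustration, and the library's own variants may size themselves differently.

```go
package main

import (
	"fmt"
	"math"
)

// optimalBits returns m = ceil(-n * ln(p) / (ln 2)^2), the number of bits a
// classic Bloom filter needs for n items at false-positive rate p.
// Hypothetical helper for illustration.
func optimalBits(n uint, p float64) uint {
	return uint(math.Ceil(-float64(n) * math.Log(p) / (math.Ln2 * math.Ln2)))
}

// optimalHashes returns k = ceil(log2(1/p)), the number of hash functions
// that minimizes the false-positive rate for that size.
// Hypothetical helper for illustration.
func optimalHashes(p float64) uint {
	return uint(math.Ceil(math.Log2(1 / p)))
}

func main() {
	const n = 10000
	for _, p := range []float64{0.1, 0.01, 0.001} {
		m := optimalBits(n, p)
		fmt.Printf("p=%.3f bits=%d (%.1f bits/item) hashes=%d\n",
			p, m, float64(m)/n, optimalHashes(p))
	}
}
```

Each tenfold reduction in the false-positive rate costs roughly 4.8 additional bits per item, so tighter accuracy is paid for in memory rather than in a change of algorithm.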
With multiple filter types and variants, developers must deeply understand trade-offs between false positives, negatives, and memory, which can be overwhelming.
All structures are approximate: depending on the structure, results may include false positives, false negatives, or estimation error, which makes them unsuitable for critical applications that require exact answers, as the README acknowledges.
Limited to the Go ecosystem, so it cannot be used directly from projects written in other programming languages.
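The accuracy caveat above is easiest to see in a classic Bloom filter, where errors are one-sided: an item that was added is always reported as present, while an item that was never added is occasionally reported as present too. The following is a minimal standard-library sketch under stated assumptions, not the Boom Filters code; the bloom type, its methods, the FNV-based double hashing, and the sizes used in main are all hypothetical choices for this example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a hypothetical, minimal Bloom filter for illustration only.
type bloom struct {
	bits []bool
	k    uint32 // number of bit positions set per item
}

func newBloom(m int, k uint32) *bloom {
	return &bloom{bits: make([]bool, m), k: k}
}

// positions derives k bit positions from one 64-bit FNV-1a hash using
// double hashing: (h1 + i*h2) mod m.
func (b *bloom) positions(item []byte) []uint32 {
	h := fnv.New64a()
	h.Write(item)
	sum := h.Sum64()
	h1, h2 := uint32(sum), uint32(sum>>32)
	m := uint32(len(b.bits))
	out := make([]uint32, b.k)
	for i := uint32(0); i < b.k; i++ {
		out[i] = (h1 + i*h2) % m
	}
	return out
}

// Add sets every bit position for the item.
func (b *bloom) Add(item []byte) {
	for _, p := range b.positions(item) {
		b.bits[p] = true
	}
}

// Test reports true only if every bit position for the item is set.
func (b *bloom) Test(item []byte) bool {
	for _, p := range b.positions(item) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	f := newBloom(4096, 3)
	for i := 0; i < 1000; i++ {
		f.Add([]byte(fmt.Sprintf("seen-%d", i)))
	}
	missed, falsePositives := 0, 0
	for i := 0; i < 1000; i++ {
		if !f.Test([]byte(fmt.Sprintf("seen-%d", i))) {
			missed++
		}
		if f.Test([]byte(fmt.Sprintf("unseen-%d", i))) {
			falsePositives++
		}
	}
	fmt.Printf("false negatives: %d, false positives: %d of 1000\n",
		missed, falsePositives)
}
```

Because bits are never cleared here, the false-negative count is always zero. Variants that evict data to bound memory, such as Stable Bloom Filters, give up that guarantee and can report false negatives as well.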