A high-performance C/C++ library for compressed bitmaps with SIMD optimizations, used in databases like ClickHouse and Apache Doris.
CRoaring is a C/C++ library implementing Roaring bitmaps, a compressed bitmap data structure optimized for high performance and low memory usage. It solves the problem of efficiently storing and querying large sets of integers, commonly needed in database indexing, search engines, and data analytics, by providing fast set operations (union, intersection, difference) with SIMD acceleration.
Systems programmers and database engineers building high-performance data processing systems, such as in-memory databases, search engines, and analytics platforms, who need efficient set representation and operations.
Developers choose CRoaring for its benchmarked performance advantages over other compressed bitmap formats, its portable design with SIMD optimizations, and its production use in major systems such as Apache Doris, ClickHouse, and Redpanda.
Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, Alibaba Tair, Redpanda, YDB and StarRocks
Uses runtime CPU dispatch to take advantage of AVX2, AVX-512, and NEON instructions where the hardware supports them, with the speedups documented in the project's benchmarks and microbenchmarks.
Implements Roaring bitmaps that significantly reduce memory usage compared to conventional bitsets while maintaining fast query speeds, backed by academic references and industry adoption.
Supports multiple compilers (GCC, Clang, Visual Studio, Xcode) and platforms (Linux, macOS, Windows, ARM, x64, POWER), with a consistent API for easy integration.
Supports concurrent use through reference counting and copy-on-write, but provides no built-in locking: each thread should work on its own copy of a bitmap, and a bitmap that one thread is modifying must not be queried from another.
IO serialization is not supported on big-endian systems (tracked as issue #423, as the README notes), which restricts its use on certain legacy or specialized architectures.
Setting up without CMake can be cumbersome; while amalgamation is offered, it requires manual handling and modern toolchains, which may not suit all build systems.
Optimal performance relies on SIMD-capable CPUs. On older or less common hardware the benefits are reduced, and runtime dispatch adds some overhead, so performance can vary across heterogeneous environments.