A fast C library for integer compression using the SIMD-accelerated StreamVByte codec.
StreamVByte is a C library for fast integer compression using the StreamVByte codec, which applies SIMD vectorization to Google's Group Varint approach. It compresses arrays of 32-bit integers at high speed, which helps reduce memory footprint and improve I/O performance in data-intensive applications. The library is patent-free and optimized for modern x64 and ARM processors.
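To make the Group Varint idea concrete, here is a minimal scalar sketch of a layout in that family: each integer is stored in 1 to 4 bytes, and a 2-bit length code per integer is packed four to a control byte, with the control bytes kept apart from the data bytes. The `sketch_*` names are hypothetical, and the bit order inside the control byte is an assumption; this is not the library's code and is not guaranteed to be byte-compatible with its output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes (1..4) needed to hold v. */
unsigned sketch_byte_len(uint32_t v) {
    if (v < (1u << 8))  return 1;
    if (v < (1u << 16)) return 2;
    if (v < (1u << 24)) return 3;
    return 4;
}

/* Encode n integers into out and return the number of bytes written.
 * Layout: (n + 3) / 4 control bytes first, then the data bytes.
 * out must have room for (n + 3) / 4 + 4 * n bytes (worst case). */
size_t sketch_encode(const uint32_t *in, size_t n, uint8_t *out) {
    assert(in != NULL && out != NULL);
    size_t ctrl_len = (n + 3) / 4;
    uint8_t *ctrl = out;
    uint8_t *data = out + ctrl_len;
    for (size_t i = 0; i < ctrl_len; i++) ctrl[i] = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned len = sketch_byte_len(in[i]);
        /* 2-bit code = len - 1; assumed order: first integer in the low bits. */
        ctrl[i / 4] |= (uint8_t)((len - 1) << (2 * (i % 4)));
        for (unsigned b = 0; b < len; b++)
            *data++ = (uint8_t)(in[i] >> (8 * b)); /* little-endian bytes */
    }
    return (size_t)(data - out);
}

/* Decode n integers from in and return the number of bytes consumed.
 * The caller must supply n; it is not stored in the stream. */
size_t sketch_decode(const uint8_t *in, size_t n, uint32_t *out) {
    assert(in != NULL && out != NULL);
    const uint8_t *ctrl = in;
    const uint8_t *data = in + (n + 3) / 4;
    for (size_t i = 0; i < n; i++) {
        unsigned len = ((ctrl[i / 4] >> (2 * (i % 4))) & 3u) + 1;
        uint32_t v = 0;
        for (unsigned b = 0; b < len; b++)
            v |= (uint32_t)(*data++) << (8 * b);
        out[i] = v;
    }
    return (size_t)(data - in);
}
```

Keeping the control bytes in their own stream is what makes vectorization practical: a decoder can read one control byte and know the lengths of the next four values before touching the data bytes. The loops here are purely scalar and only illustrate the layout.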
It is aimed at developers building high-performance databases, search engines, or data processing systems where efficient integer storage and transmission are critical, such as in-memory databases, time-series databases, and information retrieval frameworks.
It offers significantly faster compression and decompression than other byte-oriented integer compression techniques while keeping a simple API. Its SIMD acceleration and optional differential coding make it a strong fit for performance-sensitive applications that need minimal overhead.
Fast integer compression in C using the StreamVByte codec
Leverages SSE4.1 on x64 and NEON on ARM to achieve some of the fastest compression and decompression rates for integer arrays, as benchmarked in technical posts and used by high-performance databases like RediSearch.
Includes fast delta encoding for sorted integers, which significantly improves compression efficiency for sequential data like timestamps or IDs without adding complexity, as shown in the example code.
Released under the Apache License with no patent encumbrances, it is used in production systems such as Facebook Thrift and StarRocks, which lowers licensing risk and indicates real-world validation.
Offers straightforward C functions like streamvbyte_encode with cross-platform support for Linux, macOS, and Windows via C99 compilers, reducing integration overhead in diverse environments.
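The benefit of delta encoding for sorted data, mentioned above, can be illustrated without the library. The sketch below replaces each value with its difference from the previous one and then counts how many bytes a 1-to-4-bytes-per-value format would need before and after. All `sketch_*` names are hypothetical helpers for this illustration, not the library's functions, and the size estimate assumes one control byte per four values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replace each value with its difference from the previous one.
 * prev seeds the first difference (0 when there is no earlier value).
 * Unsigned wraparound keeps this reversible even for unsorted input. */
void sketch_delta_encode(const uint32_t *in, size_t n, uint32_t prev,
                         uint32_t *out) {
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i] - prev;
        prev = in[i];
    }
}

/* Inverse of sketch_delta_encode: a running (prefix) sum. */
void sketch_delta_decode(const uint32_t *in, size_t n, uint32_t prev,
                         uint32_t *out) {
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        prev += in[i];
        out[i] = prev;
    }
}

/* Estimated size in bytes at 1..4 data bytes per value
 * plus one control byte per group of four values. */
size_t sketch_packed_size(const uint32_t *v, size_t n) {
    assert(v != NULL);
    size_t bytes = (n + 3) / 4;
    for (size_t i = 0; i < n; i++) {
        if (v[i] < (1u << 8))       bytes += 1;
        else if (v[i] < (1u << 16)) bytes += 2;
        else if (v[i] < (1u << 24)) bytes += 3;
        else                        bytes += 4;
    }
    return bytes;
}
```

For four Unix timestamps a few seconds apart, every raw value needs four data bytes, while all deltas after the first fit in one byte each, so the estimated size drops from 17 bytes to 8.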
Signed integers require manual conversion with the provided zigzag functions, an extra step with potential performance overhead that is not integrated into the core compression routines.
Explicitly does not support big-endian processors, as noted in the README's compatibility section. Such processors are rare, but this rules out some embedded and legacy systems.
The compressed stream omits the integer count, forcing developers to store this metadata separately and manage it during serialization, which can lead to errors if not handled carefully.
Optimal speed requires SSE4.1 or NEON support. On processors without these, performance may degrade, and the README doesn't detail fallback mechanisms, which may matter on older hardware.
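Two of the manual steps noted above, mapping signed values to unsigned ones and storing the integer count alongside the stream, are small enough to sketch. The code below shows the standard zigzag mapping for a single value and a hypothetical frame that puts a 4-byte little-endian count in front of an opaque compressed payload. The `sketch_*` names and the frame layout are assumptions made for illustration; the library ships its own zigzag functions and prescribes no container format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Zigzag: 0, -1, 1, -2, 2, ... map to 0, 1, 2, 3, 4, ...
 * so small magnitudes of either sign become small unsigned values. */
uint32_t sketch_zigzag_encode(int32_t v) {
    uint32_t u = (uint32_t)v;
    return (u << 1) ^ (0u - (u >> 31)); /* all-ones mask when v < 0 */
}

/* Inverse mapping; assumes a two's-complement int32_t. */
int32_t sketch_zigzag_decode(uint32_t u) {
    return (int32_t)((u >> 1) ^ (0u - (u & 1u)));
}

#define SKETCH_HEADER_BYTES 4

/* Write a frame: 4-byte little-endian count, then the payload bytes.
 * frame must have room for SKETCH_HEADER_BYTES + payload_len bytes.
 * Returns the total frame size. */
size_t sketch_frame_write(uint8_t *frame, uint32_t count,
                          const uint8_t *payload, size_t payload_len) {
    assert(frame != NULL && payload != NULL);
    for (unsigned b = 0; b < SKETCH_HEADER_BYTES; b++)
        frame[b] = (uint8_t)(count >> (8 * b));
    memcpy(frame + SKETCH_HEADER_BYTES, payload, payload_len);
    return SKETCH_HEADER_BYTES + payload_len;
}

/* Read the integer count back from a frame. */
uint32_t sketch_frame_count(const uint8_t *frame) {
    assert(frame != NULL);
    uint32_t count = 0;
    for (unsigned b = 0; b < SKETCH_HEADER_BYTES; b++)
        count |= (uint32_t)frame[b] << (8 * b);
    return count;
}

/* Pointer to the compressed payload inside a frame. */
const uint8_t *sketch_frame_payload(const uint8_t *frame) {
    assert(frame != NULL);
    return frame + SKETCH_HEADER_BYTES;
}
```

Writing the count byte by byte keeps the header independent of host byte order, so a reader on another machine recovers the same count before it starts decoding the payload.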