A modern, modular, and efficient lossless data compressor in C++ that combines multiple algorithms and multi-threading for high performance.
Kanzi is a modern lossless data compressor written in C++. It combines multiple compression techniques, such as the Burrows-Wheeler Transform (BWT) and Context Modeling (CM), to achieve higher compression ratios than traditional LZ-based tools. Built-in multi-threading compresses data in parallel across CPU cores, so it makes fuller use of modern hardware in high-performance scenarios such as backups and real-time data transfers.
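To make the BWT mentioned above more concrete, here is a minimal sketch of a naive forward and inverse Burrows-Wheeler Transform. It is not Kanzi's implementation: the function names `bwtForward` and `bwtInverse` are hypothetical, and sorting all rotations is far slower than the suffix-array approaches production compressors generally use. It only shows how the transform groups similar symbols together, which later modeling stages can exploit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Naive forward BWT (illustrative only): sort every rotation of `s` and emit
// the last column. Returns the transformed text plus the row index at which
// the original string appears in the sorted rotation table.
std::pair<std::string, std::size_t> bwtForward(const std::string& s) {
    const std::size_t n = s.size();
    std::vector<std::size_t> rot(n);
    std::iota(rot.begin(), rot.end(), std::size_t{0});
    std::sort(rot.begin(), rot.end(), [&](std::size_t a, std::size_t b) {
        for (std::size_t i = 0; i < n; ++i) {
            const unsigned char ca = static_cast<unsigned char>(s[(a + i) % n]);
            const unsigned char cb = static_cast<unsigned char>(s[(b + i) % n]);
            if (ca != cb) return ca < cb;
        }
        return false;  // identical rotations compare as equal
    });
    std::string last(n, '\0');
    std::size_t primary = 0;
    for (std::size_t i = 0; i < n; ++i) {
        last[i] = s[(rot[i] + n - 1) % n];  // character preceding the rotation start
        if (rot[i] == 0) primary = i;       // row holding the original string
    }
    return {last, primary};
}

// Inverse BWT using the last-to-first (LF) mapping.
std::string bwtInverse(const std::string& last, std::size_t primary) {
    const std::size_t n = last.size();
    assert(n == 0 || primary < n);
    std::size_t count[256] = {0};
    for (unsigned char c : last) ++count[c];
    std::size_t start[256];
    std::size_t sum = 0;
    for (int c = 0; c < 256; ++c) {  // first-column position of each symbol
        start[c] = sum;
        sum += count[c];
    }
    std::vector<std::size_t> lf(n);
    for (std::size_t i = 0; i < n; ++i)
        lf[i] = start[static_cast<unsigned char>(last[i])]++;
    std::string out(n, '\0');
    std::size_t p = primary;
    for (std::size_t i = n; i-- > 0;) {  // rebuild the text back to front
        out[i] = last[p];
        p = lf[p];
    }
    return out;
}
```

For the input `banana`, `bwtForward` yields `nnbaaa` with primary index 3. The runs of identical letters in the output are what make the transformed text easier to model and encode.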
Kanzi targets developers and engineers who work with large datasets and need higher-performance compression than standard tools offer, particularly those handling diverse data types (multimedia, DNA, text) or requiring customizable compression pipelines.
Developers choose Kanzi for compression ratios that traditional LZ methods typically do not reach, while multi-core utilization keeps performance competitive. Its modular, extensible architecture with no external dependencies makes it readily adaptable to specialized data types and experimental compression techniques.
Fast lossless data compression in C++
Combines the Burrows-Wheeler Transform (BWT) and Context Modeling (CM) with other techniques to achieve compression ratios beyond traditional LZ methods; in its benchmarks, it achieves better ratios than zstd and brotli at high compression levels.
Designed for concurrency, compressing multiple blocks in parallel across CPU cores to fully utilize modern hardware, leading to significant performance gains on multi-core systems.
Includes customizable transforms for specific data types like multimedia and DNA, optimizing compression efficiency where generic algorithms fall short, as highlighted in the README.
Interface-driven architecture with no external dependencies allows easy integration and extension with new entropy codecs or transforms, making it developer-friendly for experimentation.
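The block-parallel design and interface-driven architecture described above can be sketched together. The following is a hedged illustration, not Kanzi's actual API: `ByteTransform`, `DeltaTransform`, `XorTransform`, `Pipeline`, `encodeBlocks`, and `decodeBlocks` are all hypothetical names, and the two toy stages stand in for real transforms and entropy codecs. It assumes blocks are independent, so each one can run through the pipeline on its own thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <memory>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical interface (not Kanzi's API): a reversible byte transform.
struct ByteTransform {
    virtual ~ByteTransform() = default;
    virtual Bytes forward(const Bytes& in) const = 0;
    virtual Bytes inverse(const Bytes& in) const = 0;
};

// Toy stage: stores each byte as its difference from the previous byte.
struct DeltaTransform : ByteTransform {
    Bytes forward(const Bytes& in) const override {
        Bytes out(in.size());
        std::uint8_t prev = 0;
        for (std::size_t i = 0; i < in.size(); ++i) {
            out[i] = static_cast<std::uint8_t>(in[i] - prev);
            prev = in[i];
        }
        return out;
    }
    Bytes inverse(const Bytes& in) const override {
        Bytes out(in.size());
        std::uint8_t prev = 0;
        for (std::size_t i = 0; i < in.size(); ++i) {
            prev = static_cast<std::uint8_t>(in[i] + prev);
            out[i] = prev;
        }
        return out;
    }
};

// Toy stage: XOR with a constant key (its own inverse).
struct XorTransform : ByteTransform {
    std::uint8_t key;
    explicit XorTransform(std::uint8_t k) : key(k) {}
    Bytes forward(const Bytes& in) const override {
        Bytes out(in);
        for (auto& b : out) b = static_cast<std::uint8_t>(b ^ key);
        return out;
    }
    Bytes inverse(const Bytes& in) const override { return forward(in); }
};

using Pipeline = std::vector<std::shared_ptr<const ByteTransform>>;

// Apply every stage in order.
Bytes applyForward(const Pipeline& p, Bytes block) {
    for (const auto& t : p) block = t->forward(block);
    return block;
}

// Undo every stage in reverse order.
Bytes applyInverse(const Pipeline& p, Bytes block) {
    for (auto it = p.rbegin(); it != p.rend(); ++it) block = (*it)->inverse(block);
    return block;
}

// Split `data` into fixed-size blocks and run the pipeline on each concurrently.
// One task per block keeps the sketch short; a thread pool would bound concurrency.
std::vector<Bytes> encodeBlocks(const Pipeline& p, const Bytes& data, std::size_t blockSize) {
    assert(blockSize > 0);
    std::vector<std::future<Bytes>> jobs;
    for (std::size_t off = 0; off < data.size(); off += blockSize) {
        const std::size_t len = std::min(blockSize, data.size() - off);
        const auto first = data.begin() + static_cast<std::ptrdiff_t>(off);
        Bytes block(first, first + static_cast<std::ptrdiff_t>(len));
        jobs.push_back(std::async(std::launch::async,
            [&p, b = std::move(block)]() { return applyForward(p, b); }));
    }
    std::vector<Bytes> out;
    for (auto& j : jobs) out.push_back(j.get());  // collect in original block order
    return out;
}

// Decode blocks concurrently and concatenate them in order.
Bytes decodeBlocks(const Pipeline& p, const std::vector<Bytes>& blocks) {
    std::vector<std::future<Bytes>> jobs;
    for (const Bytes& b : blocks)
        jobs.push_back(std::async(std::launch::async,
            [&p, &b]() { return applyInverse(p, b); }));
    Bytes out;
    for (auto& j : jobs) {
        Bytes part = j.get();
        out.insert(out.end(), part.begin(), part.end());
    }
    return out;
}
```

A new stage only has to implement `forward` and `inverse` to be added to a `Pipeline`, which is the kind of extensibility an interface-driven design aims for. Because futures are collected in submission order, the decoded output preserves the original block order even though blocks may finish at different times.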
Produces compressed output in its own format, which is not compatible with common tools like gzip or zstd; this limits drop-in usability, since the output must be decompressed with Kanzi itself.
Is purely a data compressor, not an archiver: it lacks capabilities such as cross-file deduplication and data recovery mechanisms, so a complete solution may need additional tools.
At higher compression levels, encoding and decoding times increase substantially, as benchmarks show level 9 taking over 11 seconds for silesia.tar, making it slower than some alternatives for speed-critical tasks.