A C++ header-only library implementing a fast and memory-efficient HAT-trie data structure for storing sets and maps of strings.
hat-trie is a C++ library that implements the HAT-trie data structure, a cache-conscious trie optimized for storing and querying large sets or maps of strings. It solves the problem of high memory usage in traditional hash tables and tries by compressing common prefixes, offering fast exact and prefix searches with low memory overhead.
C++ developers working with large dictionaries, autocompletion systems, or applications requiring efficient string storage and retrieval, such as search engines, databases, or text processing tools.
Developers choose hat-trie for its superior memory efficiency and performance in string-heavy workloads, providing a header-only, configurable alternative to standard containers like std::unordered_map while supporting advanced features like prefix searches and serialization.
C++ implementation of a fast and memory efficient HAT-trie
Benchmarks show hat-trie uses up to 3x less memory than std::unordered_map, such as 402.25 MiB vs 1246.60 MiB for the Wikipedia dataset, due to prefix compression.
Provides methods like equal_prefix_range and longest_prefix for efficient autocompletion and prefix matching, directly supported in the API as highlighted in the overview.
Easy integration as a header-only library; just include the directory, with CMake target support for streamlined build system integration.
Includes serialize and deserialize methods for efficient storage and transmission, with examples for custom serializers using Boost or standard streams.
Parameters like burst_threshold and max_load_factor allow balancing memory and speed; for example, reducing burst_threshold optimizes prefix search performance.
Keys are not stored in sorted order, limiting use cases that require ordered traversal or range queries without additional sorting overhead.
Any insert, erase, or update invalidates all iterators, complicating code that relies on stable references during concurrent or iterative operations.
Currently only supports char as the character type, with no native handling for wide characters like wchar_t, which may hinder use with Unicode or other encodings.
Serialization requires user-provided function objects, adding complexity compared to libraries with built-in serialization formats out of the box.
A family of header-only, very fast and memory-friendly hashmap and btree containers.
Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, Alibaba Tair, Redpanda, YDB and StarRocks
Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20
C++ implementation of a fast hash map and hash set using robin hood hashing
Open-Awesome is built by the community, for the community. Submit a project, suggest an awesome list, or help improve the catalog on GitHub.