A Rust library for compact ordered sets and maps using finite state transducers, enabling fast searches and range queries.
fst is a Rust library that provides fast implementations of ordered sets and maps using finite state transducers. It solves the problem of storing and querying large datasets compactly by leveraging finite state machines, enabling efficient range queries and searches. The library is designed for high-performance applications where memory efficiency and search speed are critical.
Rust developers working with large collections of keys, such as those building search engines, databases, or text processing tools that require compact storage and fast lookup operations.
Developers choose fst for its unique combination of compact storage via finite state transducers and fast query performance, including support for fuzzy searches and integration with automata libraries, making it ideal for handling massive datasets efficiently.
Represent large sets and maps compactly with finite state transducers.
Open-Awesome is built by the community, for the community. Submit a project, suggest an awesome list, or help improve the catalog on GitHub.
Uses finite state transducers to map keys to values, enabling memory-efficient representation of large datasets, as demonstrated in the blog post about indexing billions of keys with minimal memory usage.
Leverages memory mapping to perform range searches efficiently, making it ideal for ordered datasets where quick lexicographic lookups are needed.
Includes automata like Levenshtein for approximate string matching when enabled via the 'levenshtein' feature, allowing flexible querying without exact key matches.
Compatible with the regex-automata crate via the Automata trait, enabling the use of DFAs to search transducers for advanced pattern matching scenarios.
Finite state transducers are built once and not easily updated; modifying keys requires rebuilding the entire structure, which is inefficient for dynamic datasets.
Primarily designed for lexicographically sorted string keys, limiting its use for other data types or unordered collections without additional workarounds.
Advanced features like Levenshtein automaton require enabling cargo features and add dependencies (e.g., utf8-ranges), increasing setup complexity for basic use cases.