A comprehensive Go library for string comparison and edit distance algorithms, including Levenshtein, LCS, Hamming, Jaro-Winkler, and Cosine similarity.
Go-edlib is a Go library that implements a wide range of edit distance and string comparison algorithms, designed for full Unicode compatibility. It provides tools for measuring string similarity, performing fuzzy searches, and computing differences, making it valuable for applications like spell-checking, data deduplication, and natural language processing.
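To make "edit distance with Unicode compatibility" concrete, here is a minimal, standard-library-only sketch of a Levenshtein distance that compares strings by code point (rune) instead of by byte, plus one common way to normalize it into a similarity score. The names `levenshtein`, `similarity`, and `min3` are hypothetical helpers written for this illustration. They are not Go-edlib's API, and the normalization shown is an assumption, not necessarily the formula the library uses.

```go
package main

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the number of single-character insertions,
// deletions, and substitutions needed to turn a into b. Strings are
// converted to []rune first, so a multi-byte character such as "é"
// counts as one character rather than two bytes.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // distance from the empty prefix of a
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// similarity maps the distance to [0, 1] by dividing by the longer
// string's rune length (an illustrative normalization).
func similarity(a, b string) float64 {
	la, lb := len([]rune(a)), len([]rune(b))
	maxLen := la
	if lb > maxLen {
		maxLen = lb
	}
	if maxLen == 0 {
		return 1 // two empty strings are identical
	}
	return 1 - float64(levenshtein(a, b))/float64(maxLen)
}
```

With this sketch, `levenshtein("héllo", "hello")` is 1, whereas a byte-wise comparison of the same UTF-8 strings would report 2. Working on runes still treats combining sequences as several characters, so grapheme-level accuracy would need extra normalization.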
Go-edlib is aimed at Go developers working on text processing, data cleaning, or NLP tasks that require fuzzy string matching, similarity scoring, or diff generation, such as spell-checkers, search engines, and data deduplication systems.
Developers choose Go-edlib for its comprehensive collection of algorithms (including Levenshtein, LCS, Jaro-Winkler, Cosine, and more) in a single, performant, and easy-to-use library with full Unicode support, eliminating the need to integrate multiple specialized packages.
📚 String comparison and edit distance algorithms library, featuring: Levenshtein, LCS, Hamming, Damerau-Levenshtein (OSA and adjacent transpositions algorithms), Jaro-Winkler, Cosine, and more.
Includes over 10 algorithms from Levenshtein to Cosine similarity, covering most common edit distance needs in a single library, as shown in the features list.
Designed for accurate processing of international text, so every function handles non-ASCII characters correctly.
Provides functions like FuzzySearchSet with thresholds and result limits, making it easy to implement fuzzy matching without extra code.
Offers detailed godoc pages and extensive examples in the README, including code snippets for similarity scoring and diff generation.
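The idea behind a fuzzy search with a similarity threshold and a result limit can be sketched with the standard library alone. The function `topMatches` and its parameters `limit` and `minSim` are hypothetical names chosen for this illustration. They are not Go-edlib's functions or signatures, and the Levenshtein-based score is only one of several metrics such a search could use.

```go
package main

import "sort"

// levenshtein is a rune-based edit distance (illustrative helper).
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			best := prev[j-1] // substitution, free if runes match
			if ra[i-1] != rb[j-1] {
				best++
			}
			if d := prev[j] + 1; d < best { // deletion
				best = d
			}
			if d := curr[j-1] + 1; d < best { // insertion
				best = d
			}
			curr[j] = best
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// similarity normalizes the distance to [0, 1] (assumed formula).
func similarity(a, b string) float64 {
	maxLen := len([]rune(a))
	if lb := len([]rune(b)); lb > maxLen {
		maxLen = lb
	}
	if maxLen == 0 {
		return 1
	}
	return 1 - float64(levenshtein(a, b))/float64(maxLen)
}

type match struct {
	text  string
	score float64
}

// topMatches returns up to limit candidates whose similarity to query
// is at least minSim, best match first. Ties keep their input order.
func topMatches(query string, candidates []string, limit int, minSim float64) []string {
	var hits []match
	for _, c := range candidates {
		if s := similarity(query, c); s >= minSim {
			hits = append(hits, match{text: c, score: s})
		}
	}
	sort.SliceStable(hits, func(i, j int) bool {
		return hits[i].score > hits[j].score
	})
	if limit >= 0 && len(hits) > limit {
		hits = hits[:limit]
	}
	out := make([]string, len(hits))
	for i, h := range hits {
		out[i] = h.text
	}
	return out
}
```

For example, searching for `"hello"` in `{"hallo", "help", "hello", "world", "hell"}` with `minSim` 0.7 and `limit` 3 keeps the exact match first, followed by the two candidates that are one edit away. The threshold drops `"help"` and `"world"` before any sorting happens.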
Lacks support for phonetic matching methods like Soundex, which are crucial for applications such as spell-checking with homophones or name matching.
Does not include built-in concurrency features or explicitly goroutine-safe implementations, so scaling to large datasets requires manual parallelization.
Benchmarks are hosted on external sites (e.g., interactive charts), which might lead to broken links or outdated information if not maintained.
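The manual parallelization mentioned in the limitations above could look roughly like the following worker-pool sketch. It assumes the distance function is pure, meaning it has no shared mutable state, which holds for the local `levenshtein` helper defined here. Whether a given library function satisfies that assumption should be checked against its documentation. `parallelDistances` and `workers` are hypothetical names, not part of Go-edlib.

```go
package main

import "sync"

// levenshtein is a rune-based edit distance with no shared state,
// so concurrent calls are safe (illustrative helper).
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			best := prev[j-1]
			if ra[i-1] != rb[j-1] {
				best++
			}
			if d := prev[j] + 1; d < best {
				best = d
			}
			if d := curr[j-1] + 1; d < best {
				best = d
			}
			curr[j] = best
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// parallelDistances computes the distance from query to every
// candidate using a fixed number of worker goroutines. Each worker
// writes only to the result slot of the index it received, so no
// mutex is needed around the results slice.
func parallelDistances(query string, candidates []string, workers int) []int {
	if workers < 1 {
		workers = 1
	}
	results := make([]int, len(candidates))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				results[i] = levenshtein(query, candidates[i])
			}
		}()
	}
	for i := range candidates {
		jobs <- i
	}
	close(jobs) // lets the workers' range loops finish
	wg.Wait()
	return results
}
```

Results come back in the same order as the input candidates regardless of which worker handled them. For short strings the channel overhead can outweigh the gain, so sending batches of indices per job may be worth measuring on real data.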