A fast fuzzy string matching library for Ruby that implements the Jaro-Winkler distance algorithm.
fuzzy-string-match is a Ruby library that calculates the similarity between two strings using the Jaro-Winkler distance algorithm. It provides a fast, native C implementation for performance-critical applications and a pure Ruby version for compatibility. The library was ported from Apache Lucene to offer a reliable alternative to older, problematic gems.
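To make the similarity score concrete, here is a minimal textbook sketch of Jaro-Winkler in plain Ruby. It is not the gem's source code: the method names `jaro_similarity` and `jaro_winkler_similarity`, the 0.1 prefix scaling factor, the 0.7 boost threshold, and the 4-character prefix cap are assumptions taken from the common formulation of the algorithm.

```ruby
# Illustrative sketch only; not the gem's implementation.

# Jaro similarity: counts characters that match within a sliding window
# and penalizes matched characters that appear in a different order.
def jaro_similarity(a, b)
  s1 = a.chars
  s2 = b.chars
  return 1.0 if s1.empty? && s2.empty?
  return 0.0 if s1.empty? || s2.empty?

  # Characters only count as a match if their positions differ by at most `window`.
  window = [[s1.length, s2.length].max / 2 - 1, 0].max
  matched1 = Array.new(s1.length, false)
  matched2 = Array.new(s2.length, false)
  matches = 0

  s1.each_with_index do |ch, i|
    lo = [i - window, 0].max
    hi = [i + window, s2.length - 1].min
    (lo..hi).each do |j|
      next if matched2[j] || s2[j] != ch
      matched1[i] = true
      matched2[j] = true
      matches += 1
      break
    end
  end
  return 0.0 if matches.zero?

  # Transpositions: matched characters that line up out of order, halved.
  seq1 = s1.each_index.select { |i| matched1[i] }.map { |i| s1[i] }
  seq2 = s2.each_index.select { |j| matched2[j] }.map { |j| s2[j] }
  transpositions = seq1.zip(seq2).count { |x, y| x != y } / 2

  m = matches.to_f
  (m / s1.length + m / s2.length + (m - transpositions) / m) / 3.0
end

# Winkler adjustment: boost the Jaro score for strings sharing a common prefix.
# scaling and boost_threshold are assumed defaults, not values read from the gem.
def jaro_winkler_similarity(a, b, scaling: 0.1, boost_threshold: 0.7)
  jaro = jaro_similarity(a, b)
  return jaro if jaro < boost_threshold

  prefix = 0
  a.chars.zip(b.chars).first(4).each do |x, y|
    break if x != y
    prefix += 1
  end
  jaro + prefix * scaling * (1.0 - jaro)
end

puts jaro_winkler_similarity("jones", "johnson").round(4)
puts jaro_winkler_similarity("MARTHA", "MARHTA").round(4)
```

Scores run from 0.0 (no similarity) to 1.0 (identical), so a caller would typically compare the result against an application-specific threshold.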
It is aimed at Ruby developers who need fuzzy string matching for tasks such as search, data deduplication, or name matching. It is particularly useful when you need high performance on ASCII strings, or UTF-8 support through the pure Ruby version.
Developers choose fuzzy-string-match for its speed, stability, and maintainability compared to alternatives like amatch. The native C implementation offers significant performance gains, while the pure Ruby version ensures broad compatibility, all backed by a clean port from the trusted Lucene library.
In the README's benchmarks, the C implementation runs over 80 times faster than the pure Ruby version, which makes it well suited to performance-critical ASCII string matching.
Hand-ported from Apache Lucene 3.0.2 to stay faithful to a well-tested implementation and, per the author's rationale, to avoid problems such as the memory leaks found in older alternatives like amatch.
Offers both native (ASCII-only, fast) and pure Ruby (UTF-8 compatible, slow) versions, allowing developers to balance speed and character set support based on needs.
Compatible with CRuby 2.0.0+ and JRuby 1.6.6+, including fallback to pure Ruby when native compilation fails, enhancing cross-platform usability.
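Because the native version is ASCII-only and the pure Ruby version handles UTF-8, an application might pick a backend per comparison. The sketch below shows one way to do that. The helper `backend_for` is hypothetical and not part of the gem. The gem calls in the trailing comments (`FuzzyStringMatch::JaroWinkler.create` and `getDistance`) follow the usage shown in the gem's README and should be verified against your installed version.

```ruby
# Hypothetical helper (not part of the gem): return :native when every
# input string is ASCII-only, otherwise :pure for UTF-8 safety.
def backend_for(*strings)
  strings.all?(&:ascii_only?) ? :native : :pure
end

puts backend_for("jones", "johnson")  # ASCII-only input
puts backend_for("café", "cafe")      # contains a non-ASCII character

# With the gem installed, the chosen symbol could be passed to the factory:
#   require 'fuzzystringmatch'
#   matcher = FuzzyStringMatch::JaroWinkler.create(backend_for(a, b))
#   matcher.getDistance(a, b)
```

Checking inputs up front keeps the fast path for ASCII data and reserves the slower pure Ruby path for strings that actually need it.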
Only implements Jaro-Winkler distance, forcing users to fork and port other algorithms manually if needed, as the README explicitly states.
The high-performance native version does not support UTF-8 strings, requiring a switch to the drastically slower pure Ruby version for international text, a significant trade-off.
Depends on RubyInline for the C extension, which can complicate installation on systems without a working C compiler or in constrained environments.
The pure Ruby version is extremely slow (40+ seconds vs. 0.48 seconds for 1M operations in benchmarks), making it impractical for large-scale UTF-8 matching tasks.