Open-Awesome

fuzzy-string-match

Apache-2.0 · Ruby

A fast fuzzy string matching library for Ruby that implements the Jaro-Winkler distance algorithm.

287 stars · 37 forks · 0 contributors

What is fuzzy-string-match?

fuzzy-string-match is a Ruby library that calculates the similarity between two strings using the Jaro-Winkler distance algorithm. It provides a fast, native C implementation for performance-critical applications and a pure Ruby version for compatibility. The library was ported from Apache Lucene to offer a reliable alternative to older, problematic gems.
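For readers unfamiliar with the metric, here is a minimal pure-Ruby sketch of the textbook Jaro-Winkler similarity. It is an illustration only, not the gem's Lucene-derived code: the function names, the 0.1 scaling factor, and the 4-character prefix cap are common textbook defaults chosen for this example, and the prefix boost is applied unconditionally, which may differ from the library's behavior.

```ruby
# Textbook Jaro similarity in 0.0..1.0 (illustrative, not the gem's code).
def jaro_similarity(a, b)
  s1 = a.chars
  s2 = b.chars
  return 1.0 if s1.empty? && s2.empty?
  return 0.0 if s1.empty? || s2.empty?

  # Characters match only if they sit within this distance of each other.
  window = [[s1.length, s2.length].max / 2 - 1, 0].max
  matched1 = Array.new(s1.length, false)
  matched2 = Array.new(s2.length, false)
  matches = 0

  s1.each_with_index do |ch, i|
    lo = [i - window, 0].max
    hi = [i + window, s2.length - 1].min
    (lo..hi).each do |j|
      next if matched2[j] || s2[j] != ch
      matched1[i] = matched2[j] = true
      matches += 1
      break
    end
  end
  return 0.0 if matches.zero?

  # Transpositions: matched characters that appear in a different order.
  seq1 = s1.each_index.select { |i| matched1[i] }.map { |i| s1[i] }
  seq2 = s2.each_index.select { |j| matched2[j] }.map { |j| s2[j] }
  transpositions = seq1.zip(seq2).count { |x, y| x != y } / 2

  m = matches.to_f
  (m / s1.length + m / s2.length + (m - transpositions) / m) / 3.0
end

# Winkler's adjustment: reward a shared prefix of up to max_prefix characters.
def jaro_winkler(a, b, scaling: 0.1, max_prefix: 4)
  jaro = jaro_similarity(a, b)
  prefix = a.chars.zip(b.chars).take(max_prefix).take_while { |x, y| x == y }.length
  jaro + prefix * scaling * (1.0 - jaro)
end

puts jaro_winkler("MARTHA", "MARHTA").round(4)
puts jaro_winkler("jones", "johnson").round(4)
```

A score of 1.0 means the strings are identical and 0.0 means no characters matched; values in between grow with shared characters and a shared prefix.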

Target Audience

Ruby developers who need fuzzy string matching for search, data deduplication, or name matching. It is particularly useful when you need either high throughput on ASCII strings (native version) or UTF-8 support (pure Ruby version).

Value Proposition

Developers choose fuzzy-string-match for its speed, stability, and maintainability compared to alternatives like amatch. The native C implementation offers significant performance gains, while the pure Ruby version ensures broad compatibility, all backed by a clean port from the trusted Lucene library.

Overview

Fuzzy string matching library for Ruby.

Use Cases

Best For

  • Calculating similarity between names or addresses in data cleaning
  • Implementing fuzzy search in Ruby applications
  • Data deduplication tasks where string variations exist
  • Matching strings with typos or minor differences
  • Educational projects exploring string distance algorithms
  • Performance-critical text processing in Ruby
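As a rough illustration of the fuzzy search and deduplication use cases above, the sketch below ranks candidate strings by similarity to a query. `best_matches`, its `threshold` and `limit` parameters, and the toy scorer are hypothetical names invented for this example; the scorer is any callable that returns a similarity between 0.0 and 1.0.

```ruby
# Hypothetical helper: rank candidates by similarity to a query.
# `scorer` is any callable taking two strings and returning 0.0..1.0.
def best_matches(query, candidates, scorer:, threshold: 0.85, limit: 5)
  candidates
    .map { |candidate| [candidate, scorer.call(query, candidate)] }
    .select { |_, score| score >= threshold }   # drop weak matches
    .sort_by { |_, score| -score }              # best score first
    .first(limit)
end

# Toy scorer so the example runs without the gem installed:
# 1.0 for equal strings, 0.9 for a shared first letter, 0.0 otherwise.
toy_scorer = lambda do |a, b|
  next 1.0 if a == b
  a[0] == b[0] ? 0.9 : 0.0
end

names = %w[jones johnson smith]
best_matches("jones", names, scorer: toy_scorer).each do |name, score|
  puts "#{name}: #{score}"
end
```

With the gem installed, and assuming the matcher object exposes `getDistance` as shown in its README, `matcher.method(:getDistance)` could be passed as the scorer instead of the toy lambda. The 0.85 threshold is an arbitrary starting point and would need tuning on real data.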

Not Ideal For

  • Projects needing multiple string similarity algorithms beyond Jaro-Winkler, such as Levenshtein or cosine similarity
  • Applications requiring high-performance fuzzy matching with Unicode or non-ASCII text, as the native version is ASCII-only
  • Teams in environments where compiling native C extensions is problematic, like some cloud platforms or with strict security policies
  • Developers who want an all-in-one gem that bundles several ready-made similarity algorithms, rather than porting additional ones themselves

Pros & Cons

Pros

Blazing Fast Native Performance

The C implementation runs over 80 times faster than pure Ruby in benchmarks, making it ideal for performance-critical ASCII string matching, as shown in the README's timing data.

Stable and Trustworthy Implementation

Hand-ported from Apache Lucene 3.0.2. Per the author's rationale, the port is meant to reproduce Lucene's results and to avoid problems, such as memory leaks, found in older alternatives like amatch.

Flexible Dual-Mode Design

Offers both native (ASCII-only, fast) and pure Ruby (UTF-8 compatible, slow) versions, allowing developers to balance speed and character set support based on needs.
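One way to act on this trade-off is to pick the mode per call. The sketch below assumes the `FuzzyStringMatch::JaroWinkler.create` factory (with `:native` and `:pure`) and the `getDistance` method as shown in the gem's README; `matcher_mode_for` and `fuzzy_distance` are hypothetical helper names, not part of the gem.

```ruby
# Hypothetical helper: request the fast native matcher only when every
# input is pure ASCII; otherwise fall back to the UTF-8-capable pure mode.
def matcher_mode_for(*strings)
  strings.all?(&:ascii_only?) ? :native : :pure
end

# Assumes the fuzzy-string-match gem is installed and exposes the
# README's API; nothing is loaded until this method is called.
def fuzzy_distance(a, b)
  require 'fuzzystringmatch'
  matcher = FuzzyStringMatch::JaroWinkler.create(matcher_mode_for(a, b))
  matcher.getDistance(a, b)
end

puts matcher_mode_for("jones", "johnson")   # ASCII-only input
puts matcher_mode_for("Jos\u00e9", "Jose")  # contains a non-ASCII character
```

Creating a matcher on every call keeps the sketch short; in a hot loop it would likely be better to build one matcher per mode up front and reuse it.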

Broad Ruby Platform Support

Compatible with CRuby 2.0.0+ and JRuby 1.6.6+, including fallback to pure Ruby when native compilation fails, enhancing cross-platform usability.

Cons

Single-Algorithm Limitation

Only implements Jaro-Winkler distance, forcing users to fork and port other algorithms manually if needed, as the README explicitly states.

ASCII Bottleneck in Native Mode

The high-performance native version does not support UTF-8 strings, requiring a switch to the drastically slower pure Ruby version for international text, a significant trade-off.

Native Compilation Overhead

Depends on RubyInline for the C extension, which can complicate installation on systems without proper compilers or in constrained environments, adding setup complexity.

Severe Performance Disparity

The pure Ruby version is extremely slow: in the benchmarks, 1M operations take more than 40 seconds, versus 0.48 seconds for the native version. That makes it impractical for large-scale UTF-8 matching tasks.
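To put those timings in perspective, the back-of-the-envelope sketch below derives the speedup and projects the cost of an all-pairs deduplication run. The 10,000-record workload and the helper names are assumptions made for illustration, and "40+" is treated as exactly 40.0, so the pure Ruby estimates are lower bounds.

```ruby
# Timings quoted above, each for 1,000,000 comparisons.
# "40+" is treated as exactly 40.0, so pure-Ruby figures are lower bounds.
OPS = 1_000_000
PURE_SECONDS = 40.0
NATIVE_SECONDS = 0.48

# How many times faster the native version is on these figures.
def speedup
  PURE_SECONDS / NATIVE_SECONDS
end

# All-pairs deduplication compares each record with every other one once.
def pair_count(records)
  records * (records - 1) / 2
end

# Estimated seconds for `comparisons` at a measured seconds-per-million rate.
def estimated_seconds(comparisons, seconds_per_million)
  comparisons * seconds_per_million / OPS
end

pairs = pair_count(10_000) # hypothetical workload: 10,000 records
puts format('speedup: %.1fx', speedup)
puts format('pure:    %.0f s', estimated_seconds(pairs, PURE_SECONDS))
puts format('native:  %.0f s', estimated_seconds(pairs, NATIVE_SECONDS))
```

On these assumptions the native version is roughly 83 times faster, and comparing 10,000 records pairwise would take on the order of half an hour in pure Ruby against well under a minute natively. Actual numbers depend on hardware and string lengths.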

Quick Stats

Stars: 287
Forks: 37
Contributors: 0
Open issues: 8
Last commit: 6 years ago
Created: 2010

Tags

#text-processing #string-matching #fuzzy-search #native-extension #jaro-winkler #ruby

Built With

  • Ruby
  • RSpec
  • C++

Included in

NLP with Ruby · 1.1k

Related Projects

verbal_expressions

Make difficult regular expressions easy! Ruby port of the awesome VerbalExpressions repo - https://github.com/jehna/VerbalExpressions

Stars: 570
Forks: 25
Last commit: 3 years ago

regexp-examples

Generate strings that match a given regular expression

Stars: 521
Forks: 32
Last commit: 2 years ago