A Mustache template engine for Erlang/OTP built on binary pattern matching rather than regular expressions.
A PHP string manipulation library with multibyte support, optimized for performance and PHP 7+.
A JavaScript library that calculates the real length of a string by correctly counting astral symbols and ignoring ANSI escape codes.
A fast regular expression engine for Common Lisp that compiles regexes to machine code via derivative-based DFA construction.
A Ruby natural language processor for tokenizing and analyzing text with flexible filtering and custom regex support.
A Go library and CLI for emoji lookup, search, and categorization with GitHub emoji support.
A lightweight, dependency-free JavaScript library for parsing and rendering GitHub emoji in text.
A command-line tool that cleans up LaTeX files by removing comments and correcting common anti-patterns.
A .NET library and toolset for working with GitHub emoji aliases and Unicode characters across C#, ASP.NET Core, Blazor, and command-line tools.
An R package for systematically parsing strings and converting them to snake_case, camelCase, and other naming conventions.
A natural language processing framework for JVM languages with comprehensive linguistic analysis tools.
Ruby bindings to RE2, a fast, safe, thread-friendly alternative to backtracking regex engines like PCRE.
An English (Porter2) stemming implementation in Elixir for reducing words to their base forms.
A Ruby gem for calculating edit distance between strings using Levenshtein, Damerau-Levenshtein, and Boehmer & Rees algorithms.
A Go library that maps regex named groups into struct fields using struct tags and automatic parsing.
Error-recovering streaming HTML5 and XML parsers for OCaml with lazy, non-blocking, and one-pass processing.
A .NET library implementing various string similarity and distance metrics like Levenshtein, Jaro-Winkler, and Soundex.
A utility that strips the common leading whitespace from each line in a string, removing redundant indentation based on the least-indented line.
A Go library for converting Unicode text to ASCII transliterations, inspired by python-unidecode.
An Elixir library for natural language and script detection using statistical analysis without AI.
An AutoHotkey library for manipulating text files and strings with over 40 functions for line operations, search/replace, and formatting.
An OCaml template engine with near-complete compatibility with Jinja2 syntax and features.
A high-performance, regex-free Go tokenizer for parsing strings, slices, and infinite streams into customizable tokens.
A fast implementation of the Porter stemming algorithm for English word normalization in natural language processing.
A C++ and Python library for efficient extraction and analysis of n-grams, skipgrams, and flexgrams from large corpora.
A comprehensive Unicode library for OCaml providing character handling, string encodings, collation, and locale-sensitive operations.
A fast, CommonMark-compliant Markdown parser written in Crystal with syntax highlighting and customization options.
A Ruby gem that extracts structured date, time, and message information from naturally worded text.
A Node.js utility to add consistent indentation to each line of a string with customizable options.
A C++ regular expression library that is the ancestor of std::regex and offers extended functionality.
A pure LuaJIT implementation of LPeg v1.0, a PEG pattern matching library for Lua, with added left recursion support.
A collection of PowerShell modules for remoting, secret management, and text utilities, published to PowerShellGallery.com.
A cross-platform CLI tool for cleaning and improving text datasets for machine learning, with fast operations and LLM-based filtering.
A Ruby gem for lemmatizing English text, converting inflected words to their base dictionary forms.
A lightweight, auto-generated Java library for working with Unicode emojis, featuring type-safe constants and comprehensive utility methods.