A C/C++ library implementing Unicode algorithms with a focus on security, performance, and portability, handling ill-formed UTF sequences correctly.
Fast and portable character string processing in R using the Unicode ICU library.
A Rust library for character encoding conversion based on the WHATWG Encoding Standard.
A fast fuzzy string matching library for Ruby that implements the Jaro-Winkler distance algorithm.
An Elixir library for extracting and curating the primary readable content from webpages.
A Rust library providing fast suffix array construction in linear time and space, with full Unicode support.
A Unicode-aware lexer generator for OCaml that embeds lexer specifications directly in OCaml source files.
A Node.js package that provides a collection of cat-themed ASCII emoticons for use in CLI tools and JavaScript projects.
A pure Lua port of LPeg, a Parsing Expression Grammar (PEG) library for pattern matching and text processing.
A simple, developer-friendly text localization library for Clojure and ClojureScript applications.
A Scala library for natural language processing with functional and actor-based pipelines.
A pure Go library for programmatically reading from and writing to Microsoft Word DOCX files.
A pure OCaml regular expression library supporting Perl, POSIX, Emacs, and glob patterns with DFA-based matching.
A JavaScript library that converts URLs into human-readable formats by removing protocol and www prefixes.
A Rust tool for drawing low-resolution graphs directly in the terminal for quick data analysis from logs and text files.
A comprehensive Go library for string manipulation including case conversion, padding, truncation, and special character handling.
Converts camelCase strings to lowercase with a custom separator, e.g. unicornRainbow → unicorn_rainbow.
A fast monadic-style parser combinator library for stable Rust, enabling expressive and performant parsing.
A comprehensive and extensible natural language processing toolkit for Common Lisp, supporting custom pipelines and experimentation.
An efficient command-line tool and library for filtering duplicate lines from textual input, optimized for speed and memory usage.
A Pascal library for converting Markdown to HTML with support for multiple dialects including CommonMark and GitHub Flavored Markdown.
A Ruby gem providing a fast, accurate, and encoding-aware implementation of the Jaro-Winkler string similarity algorithm.
A Go tool that shortens strings using common abbreviations and smart word boundary detection for DevOps resource naming.
A pure Go production-grade regex engine with SIMD optimizations, offering 3-3000x speedup over the standard library.
A Go library for constructing regular expressions using a human-friendly, composable builder pattern.
A Python toolkit for text-focused data science on medium-sized datasets, bridging in-memory and cluster-scale processing.
A comprehensive Go library providing string manipulation functions for formatting, transformation, and analysis.
An umbrella project providing cross-platform Common Lisp libraries for building large, interactive applications including game development.
Detects the indentation type and amount from a string of code to maintain consistent formatting.
A Swift string extension for intelligent pluralization with support for irregular nouns, uncountable nouns, and custom rules.
A native Go implementation of the Porter Stemming algorithm for NLP and machine learning tasks.
Finds and pastes Unicode symbols using a Python script or Alfred workflow.
A pure Object Pascal regular expressions engine for Delphi and Free Pascal.
A lightweight lexical string parser for BBCode-style markup that converts custom tags to HTML.
A simple and lightweight fuzzy search engine that works in memory, searching for similar strings.
A Neovim plugin that implements LTeX language server's off-spec code actions for dictionary management and rule handling.