Showing 36 of 267 projects
A script that generates question/answer pairs from CNN and Daily Mail articles for machine reading comprehension research.
A portable C library implementing Perl-compatible regular expression pattern matching with Unicode support and optional JIT compilation.
A Go library and command-line tool to extract URLs from text using regular expressions.
A clean C library for Unicode normalization, case-folding, and UTF-8 processing.
A PHP library for converting emoji between native mobile formats and HTML display.
A Scala library for building fast parsers using parser combinators with minimal boilerplate.
A command-line tool for formatting, highlighting, and extracting content from XML and HTML documents.
A JavaScript library to generate URL slugs with transliteration and extensive customization options.
A lightweight, fast, and flexible parser combinator library for C#.
A curated list of awesome resources, libraries, and tools for natural language processing (NLP) in Ruby.
A Node.js library that converts strings to URL-safe slugs, handling Unicode characters and symbols.
A C# implementation of the CommonMark specification for converting Markdown to HTML, optimized for performance and portability.
A fast, highly extensible Markdown parser for PHP that supports multiple flavors, including GitHub-flavored Markdown, Markdown Extra, and traditional Markdown.
A standards-compliant, fast, and secure C library for parsing and rendering Markdown to HTML.
A library for deterministic finite automata (DFA) regular expressions and lexical analysis tools.
A curated collection of Unicode resources, quirks, and creative uses for developers and enthusiasts.
A curated collection of Unicode resources, quirks, and creative uses for developers.
A curated collection of Unicode resources, character quirks, and practical examples for developers.
A POSIX-compliant regex library with approximate (fuzzy) matching and predictable performance.
A minimal Go template engine focused solely on high-speed placeholder substitution without escaping.
A pure Elixir library for parsing Markdown into HTML and AST with extensive customization options.
A Unicode-aware string reverser for JavaScript that correctly handles combining marks and astral symbols.
A Ruby gem that transforms plain text into HTML using a pipeline of composable filters.
A Go implementation of a trie data structure with algorithms for extremely fast prefix and fuzzy string searching.
An opinionated, CommonMark-compliant Markdown formatter and Python library for enforcing consistent style.
A curated list of interesting Unicode characters with unique features, quirks, and fun uses.
Interpreted string literals for R that embed expressions in curly braces for easy data interpolation.
A sharp cut(1) clone with regex delimiters, column reordering, and automatic decompression for data exploration.
A JavaScript implementation of the Levenshtein distance algorithm for measuring string similarity, optimized for speed.
A Go library for measuring the display width of characters and strings, handling East Asian fullwidth characters.
A library that converts dash-, dot-, underscore-, or space-separated strings to camelCase or PascalCase, with Unicode support.
A natural language detection library for Go that identifies 84 languages and scripts with no external dependencies.
A Swift library for tokenizing strings using character sets and custom tokenizers when whitespace splitting is insufficient.
A regular expressions library forked from Oniguruma, focusing on Perl 5.10+ features and used as Ruby's default regex engine.