A modern, flexible regular expression library supporting multiple character encodings and syntaxes.
Oniguruma is a regular expression library written in C. It combines regex features drawn from several programming language implementations and lets the character encoding be specified per regex object. This makes it suited to applications that need a robust, encoding-aware regex engine for text in multiple languages and formats.
It is aimed at developers and systems programmers building applications that need advanced text processing across multiple character encodings, particularly those working on internationalization, text editors, or language tooling.
Developers choose Oniguruma for its extensive character encoding support, modern regex feature set combining the best from different languages, and multiple API options (native, POSIX, GNU) for compatibility. Its performance and flexibility make it a reliable choice for complex text matching tasks.
Supports per-regex encoding for over 30 character sets including UTF-8, UTF-16, and various EUC variants, as listed in the README, making it ideal for international text processing.
Incorporates features from multiple languages, with Unicode properties and advanced operators like (*SKIP) added in recent updates, enhancing flexibility for complex patterns.
Offers native, POSIX, and GNU regex APIs, providing versatility for different programming environments, as detailed in the usage and source files section.
Available on Linux, Unix, Windows via Visual Studio or vcpkg, and Cygwin, with clear installation steps for each platform, including package manager commands.
The project officially ended on April 24, 2025, as stated in the README, meaning no future updates, bug fixes, or security patches, which is a significant risk for production use.
Requires steps like autoreconf, ./configure, and make for manual installation, which can be error-prone compared to simpler libraries with one-step setups.
Changes in POSIX API handling mean that keeping binary compatibility requires a specific configure flag (--enable-binary-compatible-posix-api), a sign of past breaking changes and added build complexity.
As a C library, it lacks built-in bindings for higher-level languages, requiring additional wrappers or effort for integration compared to language-native regex solutions.
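For readers unfamiliar with the POSIX-style interface mentioned above, the call sequence looks roughly like the sketch below. Note the assumptions: to stay compilable without Oniguruma installed, the snippet includes the system <regex.h>; an Oniguruma build with its POSIX API enabled is expected to expose the same regcomp, regexec, and regfree entry points through its own onigposix.h header. The helper name posix_match is hypothetical.

```c
#include <stddef.h>
#include <regex.h> /* assumption: system header; see the note above */

/*
 * Hypothetical helper: POSIX-style match of an extended regex.
 * Returns 1 on a match, 0 when regexec reports anything else,
 * and -1 if the pattern fails to compile.
 * start/end (if not NULL) receive the byte offsets of the match.
 */
int posix_match(const char *pattern, const char *text, long *start, long *end)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    rc = regexec(&re, text, 1, m, 0);
    if (rc == 0) {
        if (start != NULL) *start = (long)m[0].rm_so;
        if (end != NULL)   *end   = (long)m[0].rm_eo;
    }

    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Calling posix_match("[0-9]+", "abc123", &s, &e) reports a match spanning byte offsets 3 to 6. Unlike the native API, this interface takes no encoding argument.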