A grep-like CLI utility that searches text files using Lucene query syntax, compiled to a native binary for fast startup.
lucene-grep is a grep-like command-line utility built on Apache Lucene, allowing developers to search text files using Lucene's advanced query syntax. It solves the problem of performing complex, language-aware text searches directly from the terminal, offering features like stemming, phrase matching, and customizable text analysis that traditional grep lacks.
It is aimed at developers and data engineers who need to run advanced text searches over log files, codebases, or documents from the command line, especially those already familiar with Lucene's query syntax or who need linguistic processing.
It combines the familiarity of grep with the power of Lucene's search engine, providing a fast, native binary with extensive query capabilities and text analysis options not available in standard Unix tools.
Grep-like utility based on Lucene Monitor compiled with GraalVM native-image
Supports full Lucene query language including phrases, wildcards, and proximity searches, enabling complex pattern matching beyond regular expressions, as shown in the phrase matching and slop examples.
Offers configurable pipelines with char filters, tokenizers, and stemmers for multiple languages, allowing linguistic processing like stopword removal and stemming, detailed in the analysis configuration section.
Compiled with GraalVM to a standalone binary for fast startup and low memory usage across Linux, macOS, and Windows, with benchmarks provided in the README.
Supports colored, tagged, JSON, and EDN outputs with customizable templates, facilitating integration into various workflows and tools.
Allows continuous input from STDIN with dynamic queries via streamed matching, avoiding cold starts for real-time text processing.
Deviates from grep syntax and requires learning the Lucene query language; the README explicitly notes that it is not compatible with grep and offers less functionality than grep does.
Text analysis setup involves verbose JSON and external resource files, adding overhead for basic searches and requiring familiarity with Lucene components.
Certain features, such as --skip-binary-files, are available only on Linux and macOS, so behavior on Windows is less consistent.
Some capabilities, such as --with-scored-highlights, are marked as ALPHA, indicating potential instability and risk for production use.
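To make the phrase, slop, and text-analysis features described above more concrete, here is a minimal pure-Python sketch of the underlying ideas. It is not Lucene and not lucene-grep's implementation: the names `STOPWORDS`, `stem`, `analyze`, and `phrase_matches` are hypothetical, the stemmer is a crude suffix stripper, and the slop logic only handles terms that appear in query order (Lucene's sloppy phrase matching also allows reordering at a higher slop cost).

```python
import re

# Tiny illustrative stopword list (an assumption, not Lucene's default set).
STOPWORDS = {"a", "an", "the", "of", "to", "in"}


def stem(token: str) -> str:
    """Crude suffix stripper standing in for a real language-aware stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text: str) -> list[str]:
    """Lowercase, split on non-alphanumerics, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(t) for t in tokens if t not in STOPWORDS]


def phrase_matches(text: str, phrase: str, slop: int = 0) -> bool:
    """Return True if the analyzed phrase terms occur in order in the text,
    with at most `slop` extra tokens in total between consecutive terms."""
    doc = analyze(text)
    terms = analyze(phrase)
    if not terms:
        return False

    def search(term_idx: int, prev_pos: int, budget: int) -> bool:
        # All phrase terms placed: the phrase matched.
        if term_idx == len(terms):
            return True
        for pos in range(prev_pos + 1, len(doc)):
            gap = pos - prev_pos - 1  # tokens skipped since the previous term
            if gap > budget:
                break
            if doc[pos] == terms[term_idx] and search(term_idx + 1, pos, budget - gap):
                return True
        return False

    return any(
        doc[start] == terms[0] and search(1, start, slop)
        for start in range(len(doc))
    )


if __name__ == "__main__":
    line = "The quick brown fox jumps"
    print(phrase_matches(line, "quick fox"))          # exact phrase, no slop
    print(phrase_matches(line, "quick fox", slop=1))  # one intervening token allowed
    print(phrase_matches("Foxes jumping", "fox jump"))  # matches only via stemming
```

Because the text and the query pass through the same `analyze` step, "Foxes jumping" can match the phrase "fox jump", which plain substring matching cannot do without hand-written alternations. A real analysis chain would swap in proper tokenizers and stemmers for the toy ones used here.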