A command-line tool that performs semantic searches on text using word embeddings to find words with similar meaning to the query.
grep for words with similar meaning to the query
Uses word embeddings to find conceptually similar terms, so searches match on meaning rather than exact text; the project's example finds matches for 'death' in a literary text.
Works with fastText models, which are published for 157 languages; the README provides tools to convert and use these models, making the tool suitable for non-English text.
Designed to mimic traditional grep with options like -C for context and -n for line numbers, reducing the learning curve for command-line users.
The --threshold option sets the similarity cutoff for a match, letting users trade recall against precision.
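The embedding-based matching and similarity threshold described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: it assumes matches are scored by cosine similarity and that a higher threshold is stricter. The three-dimensional vectors and the names `TOY_VECTORS`, `cosine_similarity`, and `semantic_grep` are invented for the example; a real model would supply vectors with hundreds of dimensions.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only.
# Real word-embedding models use hundreds of dimensions per word.
TOY_VECTORS = {
    "death": [0.9, 0.1, 0.0],
    "dying": [0.8, 0.2, 0.1],
    "grave": [0.7, 0.3, 0.2],
    "garden": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_grep(lines, query, threshold, vectors=TOY_VECTORS):
    """Return (line_number, line, matched_word, score) for each matching line.

    A line matches when any of its words has a cosine similarity to the
    query of at least `threshold` (assumed semantics: higher is stricter).
    """
    query_vec = vectors[query]
    hits = []
    for number, line in enumerate(lines, start=1):
        for raw_word in line.lower().split():
            word = raw_word.strip(".,;:!?\"'")
            vec = vectors.get(word)
            if vec is None:
                continue  # word is not in the vocabulary
            score = cosine_similarity(query_vec, vec)
            if score >= threshold:
                hits.append((number, line, word, score))
                break  # report each line once, as grep does
    return hits


if __name__ == "__main__":
    text = ["The garden was quiet.", "He was dying.", "A grave by the road."]
    for number, line, word, score in semantic_grep(text, "death", 0.90):
        print(f"{number}:{line}  [{word} {score:.2f}]")
```

Raising the threshold in this sketch keeps only the closest words (higher precision), while lowering it admits more loosely related words (higher recall).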
Requires downloading and managing large embedding models (e.g., GoogleNews-vectors-negative300-SLIM.bin is 346MB), which can be cumbersome and memory-intensive, as noted in the model download instructions.
Installation involves multiple steps: compiling or downloading binaries, obtaining model files separately (with git-lfs issues mentioned), and configuring paths via JSON, which can be error-prone.
Loading large models can slow down initial searches and increase memory usage, with the README admitting that model size reduction tools are needed for better performance.
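To give a feel for why model size matters, here is a back-of-the-envelope estimate of how many word vectors fit in a file of a given size. It is only a sketch under stated assumptions: 300-dimensional vectors (suggested by "negative300" in the file name mentioned above), each value stored as a 32-bit float, and no allowance for the space taken by the words themselves or any file header. The function name `estimate_vector_count` is hypothetical.

```python
def estimate_vector_count(file_size_bytes, dimensions=300, bytes_per_value=4):
    """Roughly estimate how many word vectors fit in a model file.

    Assumes each vector is `dimensions` values of `bytes_per_value` bytes
    and ignores the stored words and any header, so the true count will
    be somewhat lower.
    """
    bytes_per_vector = dimensions * bytes_per_value
    return file_size_bytes // bytes_per_vector


if __name__ == "__main__":
    # 346 MB taken as 346,000,000 bytes (an assumption about the unit).
    print(estimate_vector_count(346_000_000))
```

Under these assumptions every vector costs 1,200 bytes, so shrinking the vocabulary or the number of dimensions reduces both load time and memory roughly in proportion.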