A tagger, lemmatizer, morphological analyzer, and dependency parser for Dutch using memory-based NLP modules.
Extract dates, times, emails, phone numbers, and other common patterns from text using pre-built regular expressions.
A Ruby gem for filtering stopwords from text with built-in support for multiple languages via Snowball lists.
A biomedical text corpus with 97 full-text articles annotated for concepts, coreferences, and structural elements.
A Docker-based speech recognition model that converts short English WAV audio files into text using Mozilla's DeepSpeech.
A Go package for n-gram based text categorization and language detection with UTF-8 support.
A rule-based Unicode tokenizer that separates words from punctuation and splits sentences for NLP preprocessing.
A flexible, general-purpose n-gram library written in Ruby, supporting various gram types, vocabulary models, and text analysis.
A natural language date and time parser for Common Lisp, inspired by Ruby's Chronic.
Go SDK for interacting with IBM Watson AI services, providing authentication, API clients, and utilities.
A Ruby wrapper for the spaCy NLP library via PyCall, enabling tokenization, POS tagging, NER, and OpenAI integration.
An Elixir natural language processor for tokenization, counting, and string similarity analysis.
A model-driven, rule-based natural language understanding system for high-precision information extraction at scale.
A CCG parser that implements all combinators, parses to logical form, and estimates parameters for probabilistic CCG.
A Go implementation of the MMSEG Chinese word segmentation algorithm for text processing.
A Swift library implementing the TextRank algorithm for automatic text summarization and keyword extraction.