An English (Porter2) stemming implementation in Elixir for reducing words to their base forms.
A .NET Standard library for accessing IBM Watson cognitive services like Assistant, Discovery, and Speech-to-Text.
A TensorFlow-based neural network model for generating descriptive captions from images using Flickr30K and MSCOCO datasets.
A Python toolbox using deep belief networks for topic modeling on document data, producing latent representations for content-based recommendation.
A Ruby interface to the WordNet lexical database, enabling natural language processing and linguistic analysis.
A Node.js sample application demonstrating IBM Watson Natural Language Understanding service features.
A fast implementation of the Porter stemming algorithm for English word normalization in natural language processing.
A Ruby wrapper for Latent Dirichlet Allocation (LDA) that clusters documents into topics with native, Rust, and pure Ruby backends.
A Java framework for developing statistical natural language processing (NLP) components on Apache UIMA.
A C++ and Python library for efficient extraction and analysis of n-grams, skipgrams, and flexgrams from large corpora.
A Go implementation of the Rapid Automatic Keyword Extraction (RAKE) algorithm for extracting keywords from text.
A graphical syntax tree generator for linguistic research that creates publication-quality tree diagrams from bracket notation.
A Ruby gem that extracts structured date, time, and message information from naturally worded text.
A Ruby gem for lemmatizing English text, converting inflected words to their base dictionary forms.
A Python library providing German language support for TextBlob, enabling NLP tasks like tokenization, POS tagging, and sentiment analysis.
A Node.js implementation of Martin Porter's stemming algorithm for removing morphological endings from English words.
A Julia package providing high-performance, configurable tokenizers and sentence splitters for natural language processing.
A natural language processing library for Uralic and other languages, offering morphological analysis, generation, lemmatization, and lexical information.
A Zsh plugin that converts natural language descriptions into shell commands using AI, with ghost text preview.
A characteristic-rich dataset for factoid question answering with explicit question specifications to enable fine-grained QA system evaluation.
A high-level Python toolbox for topic modeling with easy-to-use functions and command-line interface.
A multilingual Ruby gem for splitting strings into tokens with extensive language support and configurable options.
An archived R package for accessing the MonkeyLearn API for text classification and extraction.
Ruby bindings for Stanford NLP tools providing part-of-speech tagging and named entity recognition capabilities.
Ruby bindings to the OpenNLP Java toolkit for natural language processing tasks like tokenization, POS tagging, and named entity recognition.
A Ruby port of the NLTK Punkt algorithm for unsupervised, language-independent sentence boundary detection.
A collection of refactored, high-quality Android examples demonstrating TensorFlow Lite for on-device machine learning tasks.
A Go library for Unicode text segmentation at word boundaries as defined by Unicode Standard Annex #29.
A TensorFlow implementation of hierarchical attention networks for document classification using GRU cells and attention mechanisms.
A collection of code samples demonstrating how to use Azure's Language Understanding (LUIS) service for natural language processing.
A Directus extension that adds an AI-powered chat interface to query and analyze your data using OpenAI.
A Python pipeline for multilingual text clustering using Latent Dirichlet Allocation with stop-word removal, n-gram features, and inverse stemming.
A Julia package for loading pretrained word embeddings like Word2Vec, FastText, and GloVe.
Simple sentiment analysis for Elixir based on AFINN-165 with emoji, booster, and negator support.
A Ruby natural language parser for recurring events that interprets expressions like 'every 2 days' or 'Sundays'.
A collection of Node-RED nodes to integrate IBM Watson AI services like speech, language, and conversation into applications.