A curated list of open-access resources and tools for Natural Language Processing (NLP) focused on the German language.
A reading comprehension dataset with Wikipedia summaries, full stories, and question-answer pairs for narrative understanding.
A text file analysis tool that detects non-inclusive language in source code and suggests inclusive alternatives.
A Go library implementing selected machine learning algorithms for natural language processing and semantic analysis.
A Ruby gem for simple sentiment analysis that classifies text as positive, negative, or neutral based on configurable thresholds.
A Node.js sample application demonstrating the IBM Watson Tone Analyzer service for detecting emotional and language tones in text.
Ruby bindings for the Stanford CoreNLP natural language processing toolkit, supporting English, French, and German.
An R package for creating interactive and customizable word cloud visualizations using wordcloud2.js.
A curated list of tools, resources, and services for humanities scholars using quantitative or computational methods.
A Julia package providing standard tools and models for text analysis and natural language processing.
Python implementations of various topic modeling algorithms including LDA, collaborative topic models, and hierarchical Dirichlet processes.
A curated collection of linguistic resources, datasets, and tools for Natural Language Processing and Computational Linguistics in Spanish.
Fast and portable character string processing in R using the Unicode ICU library.
A curated collection of learning resources, R packages, and practical examples for understanding and applying topic modeling techniques.
A Go implementation of the TextRank algorithm for automatic text summarization, phrase extraction, and keyword ranking with multithreading support.
An R package with GUI for computational stylistics and authorship attribution through statistical text analysis.
A comprehensive Natural Language Processing (NLP) library for the Crystal programming language.
A community-curated list of NLP tools, libraries, datasets, and resources across speech processing, text analysis, and machine translation.
A grep-like CLI utility that searches text files using Lucene query syntax, compiled to a native binary for fast startup.
A lightweight JavaScript proofreader that checks writing style, readability, and common errors in text.
A pure Go library for fast, offline natural language detection supporting 29 languages.
A Ruby natural language processor for tokenizing and analyzing text with flexible filtering and custom regex support.
A rule-based question classification system for Node.js that categorizes questions by type and answer format.
An interactive topic model visualization and interpretation library for Python, compatible with sklearn, Gensim, BERTopic, and Turftopic.
A Node.js sample application demonstrating IBM Watson Natural Language Understanding service features.
A curated collection of Jupyter notebooks for digital humanities research and teaching, covering text analysis, data visualization, and more.
A fast implementation of the Porter stemming algorithm for English word normalization in natural language processing.
A Ruby wrapper for Latent Dirichlet Allocation (LDA) that clusters documents into topics with native, Rust, and pure Ruby backends.
A Java framework for developing statistical natural language processing (NLP) components on Apache UIMA.
A C++ and Python library for efficient extraction and analysis of n-grams, skipgrams, and flexgrams from large corpora.
A Go implementation of the Rapid Automatic Keyword Extraction (RAKE) algorithm for extracting keywords from text.
A Ruby gem for lemmatizing English text, converting inflected words to their base dictionary forms.
A multilingual Ruby gem for splitting strings into tokens with extensive language support and configurable options.
A high-level Python toolbox for topic modeling with easy-to-use functions and command-line interface.
An archived R package for accessing the MonkeyLearn API for text classification and extraction.
A TensorFlow implementation of hierarchical attention networks for document classification using GRU cells and attention mechanisms.