A curated collection of Jupyter notebooks for digital humanities research and teaching, covering text analysis, data visualization, and more.
A lightweight Python library for building reproducible machine learning pipelines with minimal interface constraints.
A C++ and Python library for efficient extraction and analysis of n-grams, skipgrams, and flexgrams from large corpora.
A hands-on workshop introducing deep learning concepts with practical examples using neural networks, CNNs, RNNs, and autoencoders.
A Go implementation of the Rapid Automatic Keyword Extraction (RAKE) algorithm for extracting keywords from text.
A Ruby gem for lemmatizing English text, converting inflected words to their base dictionary forms.
An application that uses IBM Watson AI services and Cloud Functions to analyze videos, extracting visual and audio insights for search and categorization.
A Rust edit distance library accelerated with SIMD for fast Hamming, Levenshtein, and Damerau-Levenshtein calculations.
A Python library providing German language support for TextBlob, enabling NLP tasks like tokenization, POS tagging, and sentiment analysis.
A Julia package providing high-performance, configurable tokenizers and sentence splitters for natural language processing.
A collection of tools, datasets, and approaches for building natural language interfaces to query the Web of Data.
An archived R package for accessing the MonkeyLearn API for text classification and extraction.
Ruby bindings to the OpenNLP Java toolkit for natural language processing tasks like tokenization, POS tagging, and named entity recognition.
A collection of code samples demonstrating how to use Azure's Language Understanding (LUIS) service for natural language processing.
A tagger, lemmatizer, morphological analyzer, and dependency parser for Dutch using memory-based NLP modules.
A curated collection of books covering Artificial Intelligence, Machine Learning, Deep Learning, and Transformers for students and professionals.
A rule-based Unicode tokenizer that separates words from punctuation and splits sentences for NLP preprocessing.
A natural language date and time parser for Common Lisp, inspired by Ruby's Chronic.
A Ruby wrapper for the spaCy NLP library via PyCall, enabling tokenization, POS tagging, NER, and OpenAI integration.
Saul is a declarative domain-specific language in Scala for designing flexible machine learning models with relational feature extraction.
An Elixir natural language processor for tokenization, counting, and string similarity analysis.