Open-Awesome

© 2026 Open-Awesome. Curated for the developer elite.


NLP

165 projects

Showing 21 of 165 projects

Jupyter Notebooks for Digital Humanities

A curated collection of Jupyter notebooks for digital humanities research and teaching, covering text analysis, data visualization, and more.

#text-analysis #educational-resources #multilingual
Stars: 139
Forks: 19
Last commit: 3 years ago

steppy (Python)

A lightweight Python library for building reproducible machine learning pipelines with minimal interface constraints.

#experimentation #python-library #data-science
Stars: 136
Forks: 32
Last commit: 7 years ago

colibri-core (C++)

A C++ and Python library for efficient extraction and analysis of n-grams, skipgrams, and flexgrams from large corpora.

#c-plus-plus-library #computational-linguistics #pattern-modeling
Stars: 130
Forks: 20
Last commit: 4 months ago

Introduction to Deep Learning Using Python (GitHub) (Python)

A hands-on workshop introducing deep learning concepts with practical examples using neural networks, CNNs, RNNs, and autoencoders.

#autoencoders #educational #deep-learning
Stars: 126
Forks: 80
Last commit:

RAKE.go (Go)

A Go implementation of the Rapid Automatic Keyword Extraction (RAKE) algorithm for extracting keywords from text.

#rake-algorithm #information-retrieval #text-analysis
Stars: 123
Forks: 19
Last commit: 1 year ago

lemmatizer (Ruby)

A Ruby gem for lemmatizing English text, converting inflected words to their base dictionary forms.

#text-analysis #nlp-tools #lemmatization
Stars: 112
Forks: 15
Last commit: 4 years ago

openwhisk-darkvisionapp (JavaScript)

An application that uses IBM Watson AI services and Cloud Functions to analyze videos, extracting visual and audio insights for search and categorization.

#ibm-cloud #watson-ai #watson-visual-recognition
Stars: 110
Forks: 253
Last commit:

triple_accel (Rust)

A Rust edit distance library accelerated with SIMD for fast Hamming, Levenshtein, and Damerau-Levenshtein calculations.

#string-similarity #simd #string-matching
Stars: 110
Forks: 15
Last commit: 3 years ago

textblob-de (Python)

A Python library providing German language support for TextBlob, enabling NLP tasks like tokenization, POS tagging, and sentiment analysis.

#german-language #textblob-extension #python-library
Stars: 103
Forks: 12
Last commit: 1 year ago

Word Tokenizers (Julia)

A Julia package providing high-performance, configurable tokenizers and sentence splitters for natural language processing.

#julia #computational-linguistics #sentence-splitting
Stars: 100
Forks: 25
Last commit: 4 years ago

NLIWOD's Question answering datasets (Java)

A collection of tools, datasets, and approaches for building natural language interfaces to query the Web of Data.

#web-of-data #question-answering #natural-language-interfaces
Stars: 93
Forks: 31
Last commit:

MonkeyLearn (R)

Archived R package for accessing the MonkeyLearn API for text classification and extraction.

#text-extraction #peer reviewed #text-classification
Stars: 92
Forks: 16
Last commit: 4 years ago

open-nlp (Ruby)

Ruby bindings to the OpenNLP Java toolkit for natural language processing tasks like tokenization, POS tagging, and named entity recognition.

#java bindings #jruby #pos-tagging
Stars: 91
Forks: 11
Last commit: 1 year ago

Language Understanding (LUIS) Samples (C#)

A collection of code samples demonstrating how to use Azure's Language Understanding (LUIS) service for natural language processing.

#language-understanding #chatbots #azure
Stars: 86
Forks: 135
Last commit:

frog (C++)

A tagger, lemmatizer, morphological analyzer, and dependency parser for Dutch using memory-based NLP modules.

#c-plus-plus-library #computational-linguistics #memory-based-learning
Stars: 81
Forks: 12
Last commit: 1 month ago

AI Books

A curated collection of books covering Artificial Intelligence, Machine Learning, Deep Learning, and Transformers for students and professionals.

#ai #python-ml #ai-agent
Stars: 79
Forks: 6
Last commit: 10 months ago

ucto (C++)

A rule-based Unicode tokenizer that separates words from punctuation and splits sentences for NLP preprocessing.

#nlp-library #computational-linguistics #rule-based
Stars: 71
Forks: 14
Last commit: 1 month ago

chronicity (Common Lisp)

A natural language date and time parser for Common Lisp, inspired by Ruby's Chronic.

#datetime #natural-language-processing #date-parsing
Stars: 70
Forks: 14
Last commit: 7 years ago

ruby-spacy (Ruby)

A Ruby wrapper for the spaCy NLP library via PyCall, enabling tokenization, POS tagging, NER, and OpenAI integration.

#parsing #nlp-library #spacy
Stars: 67
Forks: 6
Last commit: 3 months ago

Saul (Scala)

Saul is a declarative domain-specific language in Scala for designing flexible machine learning models with relational feature extraction.

#declarative-programming #learning-models #ai-systems
Stars: 65
Forks: 18
Last commit: 6 years ago

gibran (Elixir)

An Elixir natural language processor for tokenization, counting, and string similarity analysis.

#string-similarity #text-metrics #nlp-library
Stars: 65
Forks: 3
Last commit: 9 years ago

Page 5 of 5

Related Tags

#Machine Learning (90)
#Natural Language Processing (89)
#Deep Learning (46)
#Python (40)
#Text Analysis (33)
#Python Library (25)
#Neural Networks (25)
#Named Entity Recognition (21)
#Tensorflow (20)
#Data Science (20)
#Computer Vision (18)
#Ai (16)