Open-Awesome
© 2026 Open-Awesome. Curated for the developer elite.

Natural Language Processing

268 projects

Showing 36 of 268 projects

stemmer (Elixir)

An English (Porter2) stemming implementation in Elixir for reducing words to their base forms.

#nlp-library #elixir #information-retrieval
Stars: 154
Forks: 10
Last commit: 2 years ago
dotnet-standard-sdk (C#)

A .NET Standard library for accessing IBM Watson cognitive services like Assistant, Discovery, and Speech-to-Text.

#hacktoberfest #cloud-ai #natural-language-processing
Stars: 148
Forks: 114
Last commit
Image Caption Generator (Jupyter Notebook)

A TensorFlow-based neural network model for generating descriptive captions from images using Flickr30K and MSCOCO datasets.

#neural-network #deep-learning #captioning-images
Stars: 145
Forks: 55
Last commit
Deep Belief Nets for Topic Modeling (Python)

A Python toolbox using deep belief networks for topic modeling on document data, producing latent representations for content-based recommendation.

#deep-belief-networks #research-tool #document-analysis
Stars: 144
Forks: 56
Last commit
wordnet (Ruby)

A Ruby interface to the WordNet lexical database, enabling natural language processing and linguistic analysis.

#semantic-analysis #lexical-database #ruby-gem
Stars: 140
Forks: 25
Last commit: 3 years ago
natural-language-understanding-nodejs (JavaScript)

A Node.js sample application demonstrating IBM Watson Natural Language Understanding service features.

#sample-app #natural-language-understanding #text-analysis
Stars: 140
Forks: 160
Last commit
stemmer (JavaScript)

A fast implementation of the Porter stemming algorithm for English word normalization in natural language processing.

#stemmer #stemming #text-analysis
Stars: 137
Forks: 8
Last commit: 3 years ago
lda-ruby (Ruby)

A Ruby wrapper for Latent Dirichlet Allocation (LDA) that clusters documents into topics with native, Rust, and pure Ruby backends.

#text-analysis #ruby-wrapper #ruby-gem
Stars: 134
Forks: 30
Last commit: 1 month ago
ClearTK (Java)

A Java framework for developing statistical natural language processing (NLP) components on Apache UIMA.

#statistical-nlp #text-analysis #language-processing
Stars: 133
Forks: 58
Last commit: 3 years ago
colibri-core (C++)

A C++ and Python library for efficient extraction and analysis of n-grams, skipgrams, and flexgrams from large corpora.

#c-plus-plus-library #computational-linguistics #pattern-modeling
Stars: 130
Forks: 20
Last commit: 4 months ago
RAKE.go (Go)

A Go implementation of the Rapid Automatic Keyword Extraction (RAKE) algorithm for extracting keywords from text.

#rake-algorithm #information-retrieval #text-analysis
Stars: 123
Forks: 19
Last commit: 1 year ago
rsyntaxtree (Ruby)

A graphical syntax tree generator for linguistic research that creates publication-quality tree diagrams from bracket notation.

#research-tool #academic-software #diagram-generator
Stars: 121
Forks: 18
Last commit: 1 month ago
nickel (Ruby)

A Ruby gem that extracts structured date, time, and message information from naturally worded text.

#datetime #reminders #time-parsing
Stars: 118
Forks: 17
Last commit: 8 years ago
lemmatizer (Ruby)

A Ruby gem for lemmatizing English text, converting inflected words to their base dictionary forms.

#text-analysis #nlp-tools #lemmatization
Stars: 112
Forks: 15
Last commit: 4 years ago
textblob-de (Python)

A Python library providing German language support for TextBlob, enabling NLP tasks like tokenization, POS tagging, and sentiment analysis.

#german-language #textblob-extension #python-library
Stars: 103
Forks: 12
Last commit: 1 year ago
porter-stemmer (JavaScript)

A Node.js implementation of Martin Porter's stemming algorithm for removing morphological endings from English words.

#commonjs #information-retrieval #natural-language-processing
Stars: 101
Forks: 12
Last commit: 5 years ago
Word Tokenizers (Julia)

A Julia package providing high-performance, configurable tokenizers and sentence splitters for natural language processing.

#julia #computational-linguistics #sentence-splitting
Stars: 100
Forks: 25
Last commit: 4 years ago
UralicNLP (Python)

A natural language processing library for Uralic and other languages, offering morphological analysis, generation, lemmatization, and lexical information.

#sami #nlp-library #computational-linguistics
Stars: 97
Forks: 7
Last commit: 2 months ago
ai-cmd (Shell)

A Zsh plugin that converts natural language descriptions into shell commands using AI, with ghost text preview.

#developer-tools #productivity #ai-assistant
Stars: 95
Forks: 18
Last commit: 24 days ago
GraphQuestions (ReScript)

A characteristic-rich dataset for factoid question answering with explicit question specifications to enable fine-grained QA system evaluation.

#nlp-research #question-answering #natural-language-processing
Stars: 94
Forks: 14
Last commit
topik (Python)

A high-level Python toolbox for topic modeling with easy-to-use functions and command-line interface.

#text-analysis #data-science #natural-language-processing
Stars: 93
Forks: 23
Last commit: 10 years ago
pragmatic_tokenizer (Ruby)

A multilingual Ruby gem for splitting strings into tokens with extensive language support and configurable options.

#nlp-library #text-analysis #multilingual
Stars: 93
Forks: 11
Last commit: 1 year ago
MonkeyLearn (R)

Archived R package for accessing the MonkeyLearn API for text classification and extraction.

#text-extraction #peer reviewed #text-classification
Stars: 92
Forks: 16
Last commit: 4 years ago
ruby-nlp (Ruby)

Ruby bindings for Stanford NLP tools providing part-of-speech tagging and named entity recognition capabilities.

#part-of-speech-tagging #nlp-tools #natural-language-processing
Stars: 92
Forks: 14
Last commit: 12 years ago
open-nlp (Ruby)

Ruby bindings to the OpenNLP Java toolkit for natural language processing tasks like tokenization, POS tagging, and named entity recognition.

#java bindings #jruby #pos-tagging
Stars: 91
Forks: 11
Last commit: 1 year ago
punkt-segmenter (Ruby)

A Ruby port of the NLTK Punkt algorithm for unsupervised, language-independent sentence boundary detection.

#nlp-library #sentence-boundaries #nltk
Stars: 91
Forks: 9
Last commit: 8 years ago
TensorFlow Lite Examples - Android (Kotlin)

A collection of refactored, high-quality Android examples demonstrating TensorFlow Lite for on-device machine learning tasks.

#android #model-deployment #minst
Stars: 90
Forks: 22
Last commit
segment (Go)

A Go library for Unicode text segmentation at word boundaries as defined by Unicode Standard Annex #29.

#unicode #word-boundaries #ragel
Stars: 89
Forks: 15
Last commit: 3 years ago
Hierarchical Attention Networks (Python)

TensorFlow implementation of hierarchical attention networks for document classification using GRU cells and attention mechanisms.

#hierarchical-networks #text-classification #text-analysis
Stars: 87
Forks: 25
Last commit
Language Understanding (LUIS) Samples (C#)

A collection of code samples demonstrating how to use Azure's Language Understanding (LUIS) service for natural language processing.

#language-understanding #chatbots #azure
Stars: 86
Forks: 135
Last commit
Directus Copilot (TypeScript)

A Directus extension that adds an AI-powered chat interface to query and analyze your data using OpenAI.

#ai-assistant #openai #directus-ai-hackathon
Stars: 86
Forks: 4
Last commit: 2 years ago
Multilingual Latent Dirichlet Allocation LDA (Python)

A Python pipeline for multilingual text clustering using Latent Dirichlet Allocation with stop-word removal, n-gram features, and inverse stemming.

#n-grams #stemming #multilingual
Stars: 83
Forks: 29
Last commit
Embeddings (Julia)

A Julia package for loading pretrained word embeddings like Word2Vec, FastText, and GloVe.

#julia #word2vec #datadeps
Stars: 83
Forks: 19
Last commit: 2 years ago
veritaserum (Elixir)

Simple sentiment analysis for Elixir based on AFINN-165 with emoji, booster, and negator support.

#hex #emoji-analysis #elixir
Stars: 83
Forks: 10
Last commit: 3 years ago
tickle (Ruby)

A Ruby natural language parser for recurring events that interprets expressions like 'every 2 days' or 'Sundays'.

#parsing #reminders #time
Stars: 83
Forks: 12
Last commit: 5 years ago
node-red-node-watson (HTML)

A collection of Node-RED nodes to integrate IBM Watson AI services like speech, language, and conversation into applications.

#language-translation #ai-services #low-code
Stars: 82
Forks: 84
Last commit
Page 7 of 8

Last commit values listed apart from their cards, in the order extracted: 20 days ago, 6 years ago, 11 years ago, 3 years ago, 3 years ago, 3 years ago, 7 years ago, 3 years ago, 1 year ago, 4 years ago

Related Tags

#Machine Learning (128)
#Nlp (89)
#Text Analysis (63)
#Deep Learning (61)
#Python (43)
#Computer Vision (33)
#Text Processing (32)
#Named Entity Recognition (31)
#Python Library (29)
#Text Classification (24)
#Tensorflow (23)
#Ruby Gem (22)