Open-Awesome

© 2026 Open-Awesome. Curated for the developer elite.


Natural Language Processing

268 projects

Showing 36 of 268 projects

Question Generation using 🤗 Transformers (Jupyter Notebook)

An open-source study on neural question generation using transformers, providing simplified training and inference pipelines.

#transformer #deep-learning #text-generation
Stars: 1.1k
Forks: 349
Last commit

whatlang-rs (Rust)

A Rust library for natural language detection using trigram models, focusing on simplicity and performance.

#nlp-library #ai #language-recognition
Stars: 1.1k
Forks: 119
Last commit: 5 months ago

NLP with Ruby (Ruby)

A curated list of awesome resources, libraries, and tools for natural language processing (NLP) in Ruby.

#computational-linguistics #ruby-gems #pos-tag
Stars: 1.1k
Forks: 70
Last commit: 2 years ago

Awesome NLP with Ruby (Ruby)

A curated list of awesome resources, libraries, and tools for natural language processing (NLP) in Ruby.

#computational-linguistics #ruby-gems #text-analysis
Stars: 1.1k
Forks: 70
Last commit: 2 years ago

This Word Does Not Exist (Python)

A GPT-2 variant that generates plausible fake words, definitions, and usage examples from scratch.

#natural-language-understanding #creative-ai #text-generation
Stars: 1.0k
Forks: 84
Last commit

QANet (Python)

A TensorFlow implementation of QANet for machine reading comprehension on the SQuAD dataset.

#squad #deep-learning #neural-networks
Stars: 985
Forks: 298
Last commit: 8 years ago

KerasNLP (Python)

A pretrained modeling library for Keras 3 offering simple, flexible, and fast access to models for text, image, and audio tasks.

#jax #keras-3 #deep-learning
Stars: 984
Forks: 341
Last commit: 5 days ago

karthikncode's Datasets for Natural Language Processing

A collaboratively maintained, reverse-chronological list of datasets and corpora for natural language processing tasks.

#ai-training-data #nlp-research #research-tools
Stars: 918
Forks: 249
Last commit: 6 years ago

CLTK (Python)

A Python natural language processing library for pre-modern languages like Latin, Ancient Greek, and Sanskrit.

#latin #ai #spacy
Stars: 907
Forks: 338
Last commit: 3 months ago

Show, Attend and Tell (Jupyter Notebook)

TensorFlow implementation of an attention-based neural image caption generator that focuses on relevant image parts while generating words.

#deep-learning #show-attend-and-tell #neural-networks
Stars: 905
Forks: 320
Last commit

quanteda (R)

An R package for the quantitative analysis of textual data, providing comprehensive tools for natural language processing and text management.

#computational-linguistics #parallel-computing #r-package
Stars: 883
Forks: 191
Last commit: 1 day ago

text2vec (R)

An efficient R package for text analysis and NLP with fast vectorization, topic modeling, and word embeddings.

#parallel-computing #word2vec #r-package
Stars: 874
Forks: 133
Last commit: 6 months ago

swift-sdk (Swift)

Swift SDK for integrating IBM Watson AI services like speech, language, and assistant into iOS and Linux applications.

#hacktoberfest #language-translation #ai-services
Stars: 870
Forks: 214
Last commit: 1 year ago

yai (Go)

An AI-powered terminal assistant that uses OpenAI ChatGPT to generate and run commands from natural language descriptions.

#developer-tools #gpt-3 #productivity
Stars: 866
Forks: 59
Last commit: 1 year ago

PIXIU (Jupyter Notebook)

An open-source suite featuring financial large language models (FinMA), instruction datasets (FIT), and evaluation benchmarks (FinBen) for financial AI.

#stock-price-prediction #financial-ai #instruction-tuning
Stars: 865
Forks: 116
Last commit: 1 year ago

opsdroid (Python)

An open-source Python framework for building chat-ops bots that connect chat services, natural language APIs, and third-party services.

#event-driven #asyncio #botkit
Stars: 864
Forks: 426
Last commit: 1 month ago

Catalyst (C#)

Catalyst is a high-performance C# NLP library inspired by spaCy, offering pre-trained models, entity recognition, and embedding training.

#natural-language-understanding #ai #text-analysis
Stars: 853
Forks: 84
Last commit: 5 days ago

Chapyter (Python)

A JupyterLab extension that integrates GPT-4 as a code interpreter, translating natural language to Python and executing it automatically.

#jupyterlab-extension #data-science #productivity-tools
Stars: 831
Forks: 69
Last commit: 2 years ago

bayesian (Go)

A Go library for naive Bayesian classification and TF-IDF calculations on string sets.

#tf-idf #naive-bayes #text-classification
Stars: 812
Forks: 128
Last commit: 6 months ago

tf-idf-similarity (Ruby)

A Ruby gem for calculating text similarity using tf*idf and BM25 vector space models.

#information-retrieval #tf-idf #text-analysis
Stars: 781
Forks: 62
Last commit: 2 years ago

Question Answering

A curated list of resources for Question Answering (QA), covering machine learning, deep learning, datasets, and research.

#squad #nlp-resources #information-retrieval
Stars: 769
Forks: 104
Last commit: 4 years ago

Alsentzer et al Clinical BERT (Python)

Pre-trained BERT models fine-tuned on clinical text from MIMIC for medical natural language processing tasks.

#medical-ai #bert-embeddings #biobert
Stars: 768
Forks: 151
Last commit: 5 years ago

DBpedia Spotlight (Scala)

A tool for automatically annotating mentions of DBpedia resources in text, linking entities to their global identifiers.

#content-tagging #text-analysis #semantic-annotation
Stars: 759
Forks: 192
Last commit: 8 years ago

DNABERT (Python)

A pre-trained BERT model designed for DNA sequence analysis, enabling genome understanding tasks like classification and motif discovery.

#transformer-model #kmer #deep-learning
Stars: 756
Forks: 179
Last commit: 4 months ago

CardMagic-Classifier (Ruby)

A Ruby library for text classification with Bayesian, LSI, logistic regression, k-NN, and TF-IDF algorithms.

#text-classification #machine-learning-algorithms #bayesian-classification
Stars: 718
Forks: 126
Last commit

MeTA (C++)

A modern C++ toolkit for text retrieval and analysis, featuring indexing, ranking, topic modeling, classification, and language models.

#information-retrieval #text-classification #graph-algorithms
Stars: 714
Forks: 237
Last commit: 3 years ago

BioBERT

Pre-trained biomedical language representation model for biomedical text mining tasks like named entity recognition and relation extraction.

#relation-extraction #biomedical-nlp #transfer-learning
Stars: 705
Forks: 91
Last commit: 6 years ago

hashbrown (TypeScript)

A framework for building AI agents that run in the browser, with support for Angular and React.

#ai #llm-integration #openai
Stars: 704
Forks: 63
Last commit: 3 months ago

whatlanggo (Go)

A natural language detection library for Go that identifies 84 languages and scripts with no external dependencies.

#multilingual-support #text-analysis #script-recognition
Stars: 688
Forks: 69
Last commit: 3 years ago

BigARTM (C++)

A fast, open-source platform for topic modeling using Additive Regularization of Topic Models (ARTM).

#additive-regularization #sparse-modeling #python-library
Stars: 674
Forks: 121
Last commit: 4 months ago

Awesome-Torch (Repository on GitHub)

A curated list of awesome Torch tutorials, projects, libraries, and communities for deep learning.

#deep-learning #neural-networks #research-tools
Stars: 650
Forks: 138
Last commit: 8 years ago

Awesome Video Text Retrieval

A curated list of deep learning resources for video-text retrieval, including papers, implementations, and datasets.

#video-retrieval #cross-modal-retrieval #research-papers
Stars: 645
Forks: 69
Last commit: 2 years ago

cookiecutter-spacy-fastapi (Python)

A cookiecutter template for deploying spaCy NLP models as FastAPI services compatible with Azure Search Custom Skills.

#fastapi #spacy #api-template
Stars: 619
Forks: 61
Last commit: 3 years ago

Stanford.NLP for .NET (C#)

A .NET wrapper for Stanford CoreNLP providing natural language processing capabilities including tokenization, parsing, and named entity recognition.

#ikvm #pos-tagging #recompiled-packages
Stars: 610
Forks: 117
Last commit: 2 years ago

stanford-corenlp-python (Python)

A Python wrapper and JSON-RPC server for Stanford CoreNLP, providing NLP tools like parsing, tagging, and coreference resolution.

#parsing #json-rpc #coreference-resolution
Stars: 610
Forks: 226
Last commit

pragmatic_segmenter (Ruby)

A rule-based sentence boundary detection gem for Ruby that works out-of-the-box across many languages.

#rule-based #sentence-boundary-detection #ruby-gem
Stars: 593
Forks: 55
Last commit: 1 year ago
Page 4 of 8

Related Tags

#Machine Learning (128)
#NLP (89)
#Text Analysis (63)
#Deep Learning (61)
#Python (43)
#Computer Vision (33)
#Text Processing (32)
#Named Entity Recognition (31)
#Python Library (29)
#Text Classification (24)
#TensorFlow (23)
#Ruby Gem (22)