Natural Language Processing

#rule-based #sentence-boundary-detection #ruby-gem

pragmatic_segmenter (Ruby)

A rule-based sentence boundary detection gem for Ruby that works out-of-the-box across many languages.

Stars: 593

Forks: 55

Awesome CoreML

A curated collection of open-source machine learning models compatible with Apple's Core ML framework.

#ai #coremltools #ios

Stars: 587

Forks: 63

#squad #deep-learning #neural-networks

R-Net (Python)

TensorFlow implementation of R-Net for machine reading comprehension on the SQuAD dataset.

Stars: 577

Forks: 209

Last commit: 8 years ago

LDAvis (JavaScript)

An R package for creating interactive web-based visualizations of Latent Dirichlet Allocation (LDA) topic models.

#statistical-visualization #r-package #text-analysis

Stars: 570

Forks: 130

#unity3d #hacktoberfest #csharp

unity-sdk (C#)

A Unity SDK for integrating IBM Watson AI services like speech, language, and vision into games and applications.

Stars: 565

Forks: 205

#probabilistic-modeling #factor-graphs #scala-library

FACTORIE (Scala)

A Scala toolkit for deployable probabilistic modeling using imperatively-defined factor graphs.

Stars: 552

Forks: 143

Last commit: 8 years ago

German NLP resources

A curated list of open-access resources and tools for Natural Language Processing (NLP) focused on the German language.

#german-language #computational-linguistics #language-resources

Stars: 528

Forks: 67

#narrative-understanding #text-analysis #deep-learning

NarrativeQA (Shell)

A reading comprehension dataset with Wikipedia summaries, full stories, and question-answer pairs for narrative understanding.

Stars: 518

Forks: 70

#word2vec #go-library #natural-language-processing

word-embedding (Go)

A Go library implementing word embedding models (Word2Vec, GloVe, LexVec) from scratch with CLI and SDK.

Stars: 507

Forks: 45

#nltk #nlp-research #deep-learning

RNNLG (Python)

An open-source benchmark toolkit for Natural Language Generation in spoken dialogue systems, featuring multiple RNN-based models and datasets.

Stars: 490

Forks: 126

Last commit: 7 years ago

Torch code for Visual Question Answering using a CNN+LSTM model (Lua)

A Torch implementation of a VIS+LSTM model for answering questions about images using deep learning.

#deep-learning #natural-language-processing #research-implementation

A command-line tool that translates plain English requests into terminal commands using AI.

#productivity #ai-assistant #shell-scripting

Stars: 482

Forks: 40

#part-of-speech-tagging #cogcomp #java-library

CogCompNLP (Java)

A comprehensive suite of Java NLP libraries and tools for text annotation, feature extraction, and language processing tasks.

Stars: 479

Forks: 143

#proofreading #grammar-checker #ruby-gem

gingerice (Ruby)

A Ruby wrapper for Ginger Proofreader that corrects spelling and grammar mistakes using contextual sentence analysis.

Stars: 477

Forks: 21

Last commit: 7 years ago

nlp (Go)

A Go library implementing selected machine learning algorithms for natural language processing and semantic analysis.

#semantic-analysis #tf-idf #text-analysis

Stars: 475

Forks: 46

Last commit: 5 years ago

sentences (Go)

A multilingual command-line sentence tokenizer written in Go, ported from NLTK's Punkt system.

#nlp-library #sentences #command-line-tool

Stars: 473

Forks: 42

#text-classification #text-analysis #ruby-gem

Sentimental (Ruby)

A Ruby gem for simple sentiment analysis that classifies text as positive, negative, or neutral based on configurable thresholds.

Stars: 465

Forks: 72

Last commit: 7 years ago

Biomedical Information Extraction

A curated list of resources for Biomedical Information Extraction (BioIE), including datasets, tools, libraries, and research.

#biomedical-language #biomedical-nlp #biomedical-data

Stars: 461

Forks: 40

Last commit: 2 months ago

Linguistics

A curated list of resources, tools, datasets, and communities for linguistics and natural language processing.

#computational-linguistics #nlp-resources #natural-language-processing

Stars: 447

Forks: 34

Last commit: 5 months ago

Graphify (Java)

A Neo4j extension for document and text classification using graph-based hierarchical pattern recognition.

#semantic-analysis #text-classification #neo4j-extension

Stars: 446

Forks: 100

#ios #arkit #natural-language-processing

ChatARKit (C)

An iOS app that uses ChatGPT to generate ARKit code from spoken prompts, placing and manipulating 3D objects in augmented reality.

Stars: 441

Forks: 35

#spacy #clinical-text #metamap

medaCy (Python)

A medical text mining and information extraction framework built on spaCy for rapid prototyping and training of predictive NLP models.

Stars: 438

Forks: 92

#ruby-bindings #text-analysis #language-processing

stanford-core-nlp (Ruby)

Ruby bindings for the Stanford CoreNLP natural language processing toolkit, supporting English, French, and German.

Stars: 436

Forks: 69

#database #data-scraping #text-corpus

FakeNewsCorpus

A dataset of millions of news articles labeled by credibility type for training fake news detection algorithms.

Stars: 412

Forks: 98

#research-tool #squad #spacy

DrQA (Python)

A PyTorch implementation of the DrQA model for reading comprehension and open-domain question answering.

Stars: 401

Forks: 109

Last commit: 4 years ago

simple_bayes (Elixir)

A Naive Bayes machine learning implementation in Elixir with multiple models and storage options.

#probabilistic-models #naive-bayes #text-classification

Stars: 396

Forks: 24

Last commit: 8 years ago

Text Analysis (Julia)

A Julia package providing standard tools and models for text analysis and natural language processing.

#nlp-library #julia #text-classification

Stars: 384

Forks: 92

Last commit: 3 months ago

scriptum (JavaScript)

A functional programming library for JavaScript/Node.js focused on string processing, regular expressions, and linear algebra.

#functional-programming #transducers #folding

Stars: 381

Forks: 20

Last commit: 11 months ago

Implementation of various topic models in Python (Jupyter Notebook)

Python implementations of various topic modeling algorithms including LDA, collaborative topic models, and hierarchical Dirichlet processes.

#research-tool #probabilistic-modeling #bayesian-statistics

A Ruby natural language parser for elapsed time that converts human-readable durations to seconds and vice versa.

#time-conversion #datetime-utilities #time-parsing

Stars: 355

Forks: 69

#academic-toolkit #dialogue-agents #dialogues

NNDIAL (Python)

An open-source toolkit for building end-to-end trainable task-oriented dialogue models with neural networks.

Stars: 353

Forks: 101

Last commit: 9 years ago

Spanish

A curated collection of linguistic resources, tools, and datasets for Natural Language Processing and Computational Linguistics on Spanish.

#computational-linguistics #pos-tagging #machine-translation

Stars: 351

Forks: 42

#computational-linguistics #text-analysis #nlp-datasets

awesome-spanish-nlp

A curated collection of linguistic resources, datasets, and tools for Natural Language Processing and Computational Linguistics on Spanish.

Stars: 351

Forks: 42

#hyperparameter-optimization #explainable-artificial-intelligence #python-library

PySS3 (Python)

A Python library for interpretable text classification using the SS3 model, with built-in visualization tools for explainable AI.

A pure Go package for running inference with pre-trained Transformer models from Hugging Face, enabling NLP tasks without external languages.

#text-classification #transformer-models #machine-translation

Stars: 330

Forks: 27