NLP Library

33 projects

Industrial-strength Natural Language Processing library for Python, featuring pretrained pipelines for 70+ languages and production-ready training.

#nlp-library #ai #spacy

Stars: 33.8k

Forks: 4.7k

Last commit: 2 months ago

NLP Compromise (JavaScript)

A lightweight JavaScript library for natural language processing that transforms text into structured data with a modest, pragmatic approach.

#part-of-speech-tagging #nlp-library #plugin-system

Fast, state-of-the-art tokenizers for training and tokenization, optimized for both research and production.

#nlp-library #natural-language-understanding #unigram

Stars: 10.9k

Forks: 1.2k

Last commit: 22 hours ago

nlp.js (JavaScript)

A modular natural language processing library for Node.js and React Native, designed for building multilingual chatbots and language utilities.

#bots #nlp-library #entity-extraction

Stars: 6.6k

Forks: 631

Last commit: 1 year ago

textacy (Python)

A Python NLP library built on spaCy for text preprocessing, feature extraction, and analysis tasks.

#nlp-library #computational-linguistics #spacy

Stars: 2.2k

Forks: 247

Last commit: 2 years ago

whatlang-rs (Rust)

A Rust library for natural language detection using trigram models, focusing on simplicity and performance.

#nlp-library #ai #language-recognition

Stars: 1.1k

Forks: 119

Last commit: 7 months ago

kagome (Go)

A self-contained Japanese morphological analyzer written in pure Go, tokenizing text into words and analyzing parts of speech.

#part-of-speech-tagging #nlp-library #hacktoberfest

Stars: 974

Forks: 60

Last commit: 22 hours ago

sentences (Go)

A multilingual command-line sentence tokenizer written in Go, ported from NLTK's Punkt system.

#nlp-library #sentences #command-line-tool

Stars: 473

Forks: 42

Last commit: 2 years ago

Text Analysis (Julia)

A Julia package providing standard tools and models for text analysis and natural language processing.

#nlp-library #julia #text-classification

Stars: 384

Forks: 92

Last commit: 3 months ago

Chalk (Scala)

A Scala library for natural language processing with functional and actor-based pipelines.

#nlp-library #functional-programming #pipeline-architecture

Stars: 260

Forks: 48

Last commit: 9 years ago

BLLIP Parser (GAP)

The BLLIP reranking parser (also known as the Charniak-Johnson parser, Charniak parser, or Brown reranking parser). See http://pypi.python.org/pypi/bllipparser/ for the Python module.

#parsing #nlp-library #ai

Stars: 227

Forks: 53

Last commit: 4 years ago

Cadmium (Just)

A comprehensive Natural Language Processing (NLP) library for the Crystal programming language.

#readability #nlp-library #modular-architecture

Stars: 211

Forks: 14

Last commit: 6 months ago

words_counted (Ruby)

A Ruby natural language processor for tokenizing and analyzing text with flexible filtering and custom regex support.

#nlp-library #word-counter #text-analysis

Stars: 164

Forks: 28

Last commit: 4 years ago

Qtypes (JavaScript)

A rule-based question classification system for Node.js that categorizes questions by type and answer format.

#qa-systems #nlp-library #text-analysis

Stars: 160

Forks: 27

Last commit: 9 years ago

stemmer (Elixir)

An English (Porter2) stemming implementation in Elixir for reducing words to their base forms.

#nlp-library #elixir #information-retrieval

Stars: 154

Forks: 10

Last commit: 2 years ago

UralicNLP (Python)

A natural language processing library for Uralic and other languages, offering morphological analysis, generation, lemmatization, and lexical information.

#sami #nlp-library #computational-linguistics

Stars: 100

Forks: 8

Last commit: 4 months ago

pragmatic_tokenizer (Ruby)

A multilingual Ruby gem for splitting strings into tokens with extensive language support and configurable options.

#nlp-library #text-analysis #multilingual

Stars: 93

Forks: 11

Last commit: 1 year ago

rwordnet (Ruby)

A pure Ruby interface to the WordNet database.

#nlp-library #wordnet #ruby

Stars: 91

Forks: 28

Last commit: 7 years ago

punkt-segmenter (Ruby)

A Ruby port of the NLTK Punkt algorithm for unsupervised, language-independent sentence boundary detection.

#nlp-library #sentence-boundaries #nltk

Stars: 91

Forks: 9

Last commit: 8 years ago

ucto (C++)

A rule-based Unicode tokenizer that separates words from punctuation and splits sentences for NLP preprocessing.

#nlp-library #computational-linguistics #rule-based

Stars: 72

Forks: 14

Last commit: 1 month ago

ruby-spacy (Ruby)

A Ruby wrapper for the spaCy NLP library via PyCall, enabling tokenization, POS tagging, NER, and OpenAI integration.

#parsing #nlp-library #spacy

Stars: 68

Forks: 6

Last commit: 3 days ago

gibran (Elixir)

An Elixir natural language processor for tokenization, counting, and string similarity analysis.

Stars: 65

Forks: 3

Last commit: 9 years ago

stemmer (Go)

A Go package providing English, German, and Dutch stemmers for natural language processing.

#german-language #nlp-library #stemming

Stars: 56

Forks: 7

Last commit: 9 years ago

python-ucto (Cython)

A Python binding to the tokeniser Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial as it appears. This binding makes the Ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto).

#nlp-library #computational-linguistics #text-processing

Stars: 32

Forks: 5

Last commit