A curated list of resources dedicated to Natural Language Processing (NLP), including libraries, datasets, tutorials, and research.
A Python library for topic modeling, document indexing, and similarity retrieval with large corpora.
A Python library for topic modeling, document indexing, and similarity retrieval with large text corpora.
A comprehensive Python library for natural language processing, providing modules, datasets, and tutorials for NLP research and development.
A comprehensive Node.js library offering a wide range of natural language processing facilities.
A Python web mining module with tools for scraping, NLP, machine learning, network analysis, and visualization.
A linter that catches insensitive, inconsiderate writing in plain text, HTML, Markdown, and MDX.
A naive linter for English prose that helps developers improve their writing by detecting common style issues.
A command-line linter for English prose that checks writing style, grammar, and usage against advice from expert writers.
A library that detects the language of text, with support for up to 419 languages, more than any other library.
A Python library and CLI tool for automatic text summarization using extractive methods like LexRank, LSA, Luhn, and Edmundson.
A pure-Python library for computing distances between sequences, with 30+ algorithms and optional external libraries for speed.
A free, state-of-the-art library and toolkit for named entity extraction and binary relation detection from text.
A Python library that automatically detects the character encoding of text files and byte streams with high accuracy and speed.
A natural language processor powered by plugins that transforms and analyzes text using syntax trees.
A Python NLP library built on spaCy for text preprocessing, feature extraction, and analysis tasks.
An easy-to-use, state-of-the-art named-entity recognition (NER) tool based on neural networks.
A comprehensive natural language processing framework for Ruby with support for text extraction, parsing, and machine learning.
The most accurate natural language detection library for Go, excelling with short text and mixed-language content.
A curated collection of Ruby libraries, tools, and resources for Natural Language Processing (NLP).
A command-line tool that performs semantic searches on text using word embeddings to find words with similar meaning to the query.
A Rust library for natural language detection using trigram models, focusing on simplicity and performance.
A curated list of awesome resources, libraries, and tools for natural language processing (NLP) in Ruby.
A Python natural language processing library for pre-modern languages like Latin, Ancient Greek, and Sanskrit.
An R package for the quantitative analysis of textual data, providing comprehensive tools for natural language processing and text management.
An efficient R package for text analysis and NLP with fast vectorization, topic modeling, and word embeddings.
Catalyst is a high-performance C# NLP library inspired by spaCy, offering pre-trained models, entity recognition, and embedding training.
A Ruby gem for calculating text similarity using tf*idf and BM25 vector space models.
A tool for automatically annotating mentions of DBpedia resources in text, linking entities to their global identifiers.
A modern C++ toolkit for text retrieval and analysis, featuring indexing, ranking, topic modeling, classification, and language models.
An R package for joining data frames on inexact matching using string distance, regex, numeric tolerance, and other fuzzy criteria.
A natural language detection library for Go that identifies 84 languages and scripts with no external dependencies.
A fast, open-source platform for topic modeling using Additive Regularization of Topic Models (ARTM).
A Python wrapper and JSON-RPC server for Stanford CoreNLP, providing NLP tools like parsing, tagging, and coreference resolution.
A vector space search engine, vector database, and key/value store for efficient string processing and vector operations.
An R package for creating interactive web-based visualizations of Latent Dirichlet Allocation (LDA) topic models.