A Python library for language-vision intelligence research, providing unified access to state-of-the-art models, datasets, and tasks.
A comprehensive Node.js library offering a wide range of natural language processing facilities.
Fast, state-of-the-art tokenizers, covering both tokenizer training and tokenization, optimized for research and production.
An automated machine learning library that trains and deploys high-accuracy models for tabular, text, image, and time series data with minimal code.
Code examples and tutorials for Stanford's TensorFlow for Deep Learning Research course (CS 20).
A Python web mining module with tools for scraping, NLP, machine learning, network analysis, and visualization.
A curated list of resources, tools, datasets, and learning materials for Chinese Natural Language Processing.
A Python NLP library from Stanford for tokenization, sentence segmentation, NER, and dependency parsing across 60+ languages.
A curated collection of machine learning models in Core ML format for iOS, macOS, tvOS, and watchOS developers.
A multi-domain Chinese word segmentation toolkit offering higher accuracy and domain-specific models.
A modular natural language processing library for Node.js and React Native, designed for building multilingual chatbots and language utilities.
A curated list of resources dedicated to recurrent neural networks (RNNs) and deep learning.
An alphabetical list of free and public domain text datasets for Natural Language Processing (NLP) tasks.
A TensorFlow implementation of a convolutional neural network for sentence classification based on Yoon Kim's paper.
An open-source pipeline for training medical domain GPT models using PT, SFT, RLHF, DPO, ORPO, and GRPO methods.
A JavaScript library for parsing text to extract dates, times, phone numbers, emails, places, and other structured information.
A terminal-based AI assistant that analyzes code, automates workflows, and executes tasks using natural language commands.
A linter that catches insensitive, inconsiderate writing in plain text, HTML, Markdown, and MDX.
A Python module for easily training character- or word-level text-generating neural networks on any dataset with minimal code.
A PyTorch system for open-domain question answering by retrieving and reading documents, originally applied to Wikipedia.
A state-of-the-art Natural Language Processing library built on Apache Spark, offering 100,000+ pretrained models and pipelines in 200+ languages.
A proof-of-concept ChatGPT integration for Unity Editor that allows controlling the editor using natural language prompts.
A curated list of 100 foundational and influential papers in natural language processing for students and researchers.
A C#/.NET library for efficient local inference of LLaMA and other large language models, based on llama.cpp.
A Python library and CLI tool for automatic text summarization using extractive methods like LexRank, LSA, Luhn, and Edmundson.
A Python library for computing distances between sequences, offering 30+ algorithms in pure Python with optional external libraries for speed.
A library that integrates large language models like ChatGPT into scikit-learn for text analysis tasks.
A neural network library optimized for dynamic structures that change per training instance, with C++ and Python bindings.
A visual roadmap and keyword mind map for students learning Natural Language Processing, from the basics to state-of-the-art models.
A pure Ruby natural language date/time parser that converts human-readable phrases into structured time objects.
A Rust-native port of Hugging Face Transformers providing ready-to-use NLP pipelines and transformer models like BERT, GPT2, and T5.
A free, state-of-the-art library and toolkit for named entity extraction and binary relation detection from text.
A TensorFlow implementation of a neural conversational model (seq2seq) for building deep learning chatbots.
A lightweight deep learning library with a functional API for composing models, compatible with PyTorch, TensorFlow, and MXNet.
A Go library for efficient multilingual text segmentation and NLP, supporting English, Chinese, Japanese, and more.