A Python library for language-vision intelligence research, providing unified access to state-of-the-art models, datasets, and tasks.
A comprehensive Node.js library offering a wide range of natural language processing facilities.
Fast, state-of-the-art tokenizers that can be trained and applied to text, optimized for both research and production.
Code examples and tutorials for Stanford's TensorFlow for Deep Learning Research course (CS 20).
An automated machine learning library that trains and deploys high-accuracy models for tabular, text, image, and time series data with minimal code.
A Python web mining module with tools for scraping, NLP, machine learning, network analysis, and visualization.
A curated list of resources, tools, datasets, and learning materials for Chinese Natural Language Processing.
A Python NLP library from Stanford for tokenization, sentence segmentation, NER, and dependency parsing across 60+ languages.
A curated collection of machine learning models in Core ML format for iOS, macOS, tvOS, and watchOS developers.
A multi-domain Chinese word segmentation toolkit offering higher accuracy and domain-specific models.
A modular natural language processing library for Node.js and React Native, designed for building multilingual chatbots and language utilities.
A curated list of resources dedicated to recurrent neural networks (RNNs) and deep learning.
An alphabetical list of free and public domain text datasets for Natural Language Processing (NLP) tasks.
A TensorFlow implementation of a convolutional neural network for sentence classification based on Yoon Kim's paper.
An open-source pipeline for training medical domain GPT models using PT, SFT, RLHF, DPO, ORPO, and GRPO methods.
A JavaScript library for parsing text to extract dates, times, phone numbers, emails, places, and other structured information.
A terminal-based AI assistant that analyzes code, automates workflows, and executes tasks using natural language commands.
A linter that catches insensitive, inconsiderate writing in plain text, HTML, Markdown, and MDX.
A Python module for easily training character- or word-level text-generating neural networks on any dataset with minimal code.
A PyTorch system for open-domain question answering by retrieving and reading documents, originally applied to Wikipedia.
A state-of-the-art Natural Language Processing library built on Apache Spark, offering 100,000+ pretrained models and pipelines in 200+ languages.
A proof-of-concept ChatGPT integration for Unity Editor that allows controlling the editor using natural language prompts.
A curated list of 100 foundational and influential papers in natural language processing for students and researchers.
A Python library and CLI tool for automatic text summarization using extractive methods like LexRank, LSA, Luhn, and Edmundson.
A C#/.NET library for efficient local inference of LLaMA and other large language models, based on llama.cpp.
A Python library for computing distances between sequences, offering 30+ algorithms in pure Python with optional external libraries for speed.
A library that integrates large language models like ChatGPT into scikit-learn for text analysis tasks.
A neural network library optimized for dynamic structures that change per training instance, with C++ and Python bindings.
A visual roadmap and keyword mind map for students learning Natural Language Processing, from basics to SOTA models.
A pure Ruby natural language date/time parser that converts human-readable phrases into structured time objects.
A Rust-native port of Hugging Face Transformers providing ready-to-use NLP pipelines and transformer models like BERT, GPT2, and T5.
A free, state-of-the-art library and toolkit for named entity extraction and binary relation detection from text.
A TensorFlow implementation of a neural conversational model (seq2seq) for building deep learning chatbots.
A lightweight deep learning library with a functional API for composing models, compatible with PyTorch, TensorFlow, and MXNet.
A Go library for efficient multilingual text segmentation and NLP, supporting English, Chinese, Japanese, and more.