A model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks.
An open platform for training, serving, and evaluating chatbots based on large language models.
A transformer-based text-to-audio model that generates realistic multilingual speech, music, and sound effects.
Minimal inference code for running FLUX.1 open-weight models for image generation and editing.
An open-source framework for financial large language models, enabling cost-effective fine-tuning for tasks like sentiment analysis and forecasting.
A comprehensive library for post-training foundation models using reinforcement learning and fine-tuning techniques.
An open-source Java library that simplifies integrating LLMs into Java applications through a unified API and comprehensive toolbox.
An open-source Python toolkit for speaker diarization with state-of-the-art pretrained models and pipelines.
A fast, flexible, and hardware-aware LLM inference engine with zero-config support for any Hugging Face model.
An open-source pipeline for training medical domain GPT models using PT, SFT, RLHF, DPO, ORPO, and GRPO methods.
A domain-specific generative language model pre-trained on biomedical literature for text generation and mining tasks.
A free course teaching diffusion models theory and hands-on implementation using Hugging Face's Diffusers library.
A free course teaching how to design, train, and deploy a production-ready real-time financial advisor LLM system using RAG and LLMOps.
A Rust-native port of Hugging Face Transformers providing ready-to-use NLP pipelines and transformer models like BERT, GPT-2, and T5.
A multimodal protein language model for generative protein design and engineering by jointly reasoning over sequence, structure, and function.
A JAX/Flax-based framework for easy and scalable pre-training, fine-tuning, evaluation, and serving of large language models.
An accelerated machine learning framework for Go, offering a PyTorch/JAX/TensorFlow-like experience with support for CPUs, GPUs, TPUs, and WASM.
State-of-the-art pre-trained transformer language models for protein sequences, enabling tasks like structure prediction and function annotation.
An open-source study on neural question generation using transformers, providing simplified training and inference pipelines.
A collection of transformer-based foundation models for genomics and transcriptomics, enabling tasks like sequence analysis, functional prediction, and conversational DNA exploration.
A long-range genomic foundation model that processes DNA sequences of up to 1 million nucleotides at single-nucleotide resolution.
A BERT model pre-trained on PubMed abstracts and clinical notes for biomedical natural language processing tasks.
A collection of BERT-like transformer models pre-trained on chemical SMILES data for drug design and property prediction.
A foundation model for multi-species genome understanding, achieving state-of-the-art performance on 28 genomic tasks.
A command-line interface for fast audio transcription using optimized Whisper ASR models.
A collection of genomic language models for predicting variant effects and evolutionary constraints from DNA sequences.
A pure Go package for running inference with pre-trained Transformer models from Hugging Face, enabling NLP tasks without external languages.
A tool that extracts and indexes knowledge from websites, PDFs, docs, and YouTube to power Q&A sessions using GPT and other language models.
An AI-powered completion source for nvim-cmp that integrates with multiple AI backends for code suggestions.
Ankh is a state-of-the-art protein language model for general-purpose protein modeling and engineering tasks.
A bi-directional equivariant transformer for long-range DNA sequence modeling, enabling reverse-complement aware genomic analysis.
A minimalistic C++ Jinja templating engine specifically designed for LLM chat templates, used in llama.cpp and other projects.
A T5-based model for bidirectional translation between molecular structures (SMILES) and natural language descriptions.
A collection of pre-trained BERT, DistilBERT, ELECTRA, GPT-2, and ConvBERT models for multiple languages, including German, Italian, Turkish, and historic texts.
An open-source prompt guard model that detects prompt injection attacks while mitigating over-defense against benign inputs.