Showing 19 of 19 projects
A platform to run, manage, and serve open-source large language models (LLMs) locally or on your own infrastructure.
A high-throughput, memory-efficient inference and serving engine for large language models (LLMs).
An open platform for training, serving, and evaluating large language model-based chatbots.
A minimalist, high-performance machine learning framework for Rust with a focus on serverless inference and GPU support.
A composable, modular, and scalable machine learning toolkit for building AI platforms on Kubernetes.
A low-code declarative framework for building custom LLMs, neural networks, and other AI models with YAML configurations.
An open-source inference serving platform for deploying AI models from multiple frameworks across cloud, data center, and edge devices.
A practical booklet covering the four main steps of designing machine learning systems, accompanied by 27 interview questions.
A Python library for building production-ready model inference APIs, job queues, and multi-model serving systems for AI applications.
A platform for deploying, managing, and scaling machine learning models in production on AWS infrastructure.
A fast, flexible, and hardware-aware LLM inference engine with zero-config support for any Hugging Face model.
An open-source MLOps/LLMOps suite for experiment management, data management, pipelines, orchestration, scheduling, and model serving.
A curated collection of resources for building, training, serving, and optimizing production-grade Large Language Model applications.
An MLOps framework to package, deploy, monitor, and manage thousands of production machine learning models on Kubernetes.
An open-source deep learning API and server written in C++ that supports multiple backends, such as PyTorch, TensorRT, and TensorFlow, for training and inference.
A JAX/Flax-based framework for easy and scalable pre-training, fine-tuning, evaluation, and serving of large language models.
A Go library that simplifies TensorFlow's Go bindings with method chaining, automatic scoping, and type conversion.
A command-line tool for creating reproducible, container-based development environments for AI/ML workflows.
A visual workflow-based AI deployment framework for multi-platform and multi-backend inference, supporting large models and edge devices.
Open-Awesome is built by the community, for the community. Submit a project, suggest an awesome list, or help improve the catalog on GitHub.