A C/C++ library for efficient, cross-platform LLM inference with extensive hardware support and quantization.
A high-throughput, memory-efficient inference and serving engine for large language models (LLMs).
A high-performance serving framework for large language models and multimodal models, delivering low-latency, high-throughput inference.
A fast, memory-efficient reimplementation of OpenAI's Whisper speech-to-text model using CTranslate2.
A minimalist, high-performance machine learning framework for Rust with a focus on serverless inference and GPU support.
A low-level tensor library for machine learning with integer quantization, automatic differentiation, and zero runtime allocations.
An LLM acceleration library for Intel XPU (GPU, NPU, CPU) to speed up local inference and finetuning of popular models.
A fast, flexible, and hardware-aware LLM inference engine with zero-config support for any Hugging Face model.
A JPEG encoder library that improves compression efficiency for higher quality and smaller file sizes.
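Several of the projects above lean on integer quantization to shrink model weights and speed up inference. As an illustration only (a hypothetical sketch, not code from any listed project), symmetric int8 weight quantization maps each float weight onto a signed 8-bit grid scaled by the largest absolute value:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    # Symmetric quantization: map [-max|w|, +max|w|] onto [-127, 127].
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original float weights.
    return q.astype(np.float32) * scale

w = np.array([0.12, -0.5, 0.33, 0.98], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Round-trip error is bounded by half a quantization step (scale / 2),
# while storage drops from 4 bytes to 1 byte per weight.
```

Real libraries refine this idea with per-block scales, asymmetric zero points, and sub-byte formats, but the core trade of precision for memory and bandwidth is the same.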
Open-Awesome is built by the community, for the community. Submit a project, suggest an awesome list, or help improve the catalog on GitHub.