A C/C++ library for efficient, cross-platform LLM inference with extensive hardware support and quantization.
A high-throughput, memory-efficient inference and serving engine for large language models (LLMs).
A high-performance serving framework for large language models and multimodal models, delivering low-latency, high-throughput inference.
A fast, memory-efficient reimplementation of OpenAI's Whisper speech-to-text model using CTranslate2.
A minimalist, high-performance machine learning framework for Rust with a focus on serverless inference and GPU support.
A low-level tensor library for machine learning with integer quantization, automatic differentiation, and zero runtime allocations.
An LLM acceleration library for Intel XPU (GPU, NPU, CPU) to speed up local inference and finetuning of popular models.
A fast, flexible, and hardware-aware LLM inference engine with zero-config support for any Hugging Face model.
A JPEG encoder library that improves compression efficiency for higher quality and smaller file sizes.
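Several of the projects above lean on integer quantization to shrink model weights and speed up inference. As an illustration only (a hypothetical sketch, not code from any listed project), symmetric int8 weight quantization maps each float weight onto a signed 8-bit grid scaled by the largest absolute value:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    # Symmetric quantization: map [-max|w|, +max|w|] onto [-127, 127].
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original float weights.
    return q.astype(np.float32) * scale

w = np.array([0.12, -0.5, 0.33, 0.98], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Round-trip error is bounded by half a quantization step (scale / 2),
# while storage drops from 4 bytes to 1 byte per weight.
```

Real libraries refine this idea with per-block scales, asymmetric zero points, and sub-byte formats, but the core trade of precision for memory and bandwidth is the same.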
Open-Awesome is built by the community, for the community. Submit a project, suggest an awesome list, or help improve the catalog on GitHub.