Inference

16 projects

vllm · Python

A high-throughput, memory-efficient inference and serving engine for large language models (LLMs).

#distributed-inference #transformer #cuda

High-performance C/C++ port of OpenAI's Whisper for efficient, cross-platform speech recognition.

#transformer #ggml #offline

Stars: 48.9k · Forks: 5.4k · Last commit: 4 days ago

Bindings for many languages · C++

A high-performance C/C++ port of OpenAI's Whisper model for efficient, cross-platform speech recognition.

#transformer #automatic-speech-recognition #openai

Stars: 48.9k · Forks: 5.4k · Last commit: 4 days ago

Colossal-AI - An Integrated Large-scale Model Training System with Efficient Parallelization Techniques · Python

A unified deep learning system for efficient large-scale model training and inference with advanced parallelism strategies.

#big-model #distributed-training #ai

Cross-platform framework for building customizable on-device machine learning pipelines for live and streaming media.

#media-processing #video-processing #on-device-ml

A high-performance serving framework for large language models and multimodal models, delivering low-latency and high-throughput inference.

#transformer #cuda #llm-serving

A high-performance neural network inference framework optimized for mobile platforms, enabling efficient AI deployment on edge devices.

#vulkan #ios #ncnn

Stars: 23.1k · Forks: 4.4k · Last commit: 2 days ago

faster-whisper · Python

A fast, memory-efficient reimplementation of OpenAI's Whisper speech-to-text model using CTranslate2.

#transformer #ai #python-library

Stars: 22.4k · Forks: 1.8k · Last commit: 5 months ago

ONNX · Python

An open standard format for representing machine learning models to enable interoperability between frameworks.

#neural-network #deep-learning #neural-networks

Stars: 20.7k · Forks: 3.9k · Last commit: 3 days ago

ts-pattern · TypeScript

An exhaustive pattern-matching library for TypeScript with smart type inference and an expressive API.

#matching #functional-programming #type-safety

Stars: 15.0k · Forks: 163 · Last commit: 14 days ago

ggml · C++

A low-level tensor library for machine learning with integer quantization, automatic differentiation, and zero runtime allocations.

#tensor-library #quantization #c

Stars: 14.5k · Forks: 1.6k · Last commit: 2 days ago

TensorRT · C++

NVIDIA's SDK for high-performance deep learning inference optimization and deployment on NVIDIA GPUs.

#cuda #neural-network #nvidia

Stars: 12.9k · Forks: 2.3k · Last commit: 10 days ago

triton-inference-server · Python

An open-source inference serving platform for deploying AI models from multiple frameworks across cloud, data center, and edge devices.

#inference-serving #datacenter #deep-learning

SDKs for adding private, on-device AI features like LLM chat, speech-to-text, and text-to-speech to mobile and web apps.

#ios #on-device-ai #android

Stars: 10.3k · Forks: 352 · Last commit: 2 days ago

simple-statistics · JavaScript

A lightweight, dependency-free JavaScript library for descriptive, regression, and inference statistics.

#statistics #math #inference

Stars: 3.5k · Forks: 231 · Last commit: 1 month ago

nnabla · Python

A deep learning framework for research, development, and production with a flexible Python API and a C++ core.

#cuda #model-training #deep-learning

Stars: 2.8k · Forks: 335 · Last commit: 7 months ago

Community-curated · Updated weekly · 100% open source
