Showing 30 of 66 projects
An open-source library for training and deploying deep learning recommendation models on sparse data at scale, with multi-GPU support.
A lightweight, single-binary Rust inference server providing fully OpenAI-API-compatible endpoints for local GGUF models.
An open-source cross-platform performance library of basic building blocks for deep learning applications, optimized for CPUs and GPUs.
A highly efficient, scalable Gaussian process library implemented in PyTorch with GPU acceleration and modular design.
Intel's reference deep learning framework designed for high performance across CPUs, GPUs, and custom hardware.
A fast Python library for collaborative filtering recommendation algorithms on implicit-feedback datasets.
A modular library for Bayesian optimization built on PyTorch, enabling efficient optimization of expensive black-box functions.
A high-performance string library leveraging SIMD and SWAR to accelerate search, hashing, sorting, and edit distances across C, C++, Python, Rust, and more.
A fast, differentiable physics engine built with JAX for massively parallel rigid body simulation on accelerator hardware.
A WebGPU-accelerated TypeScript charting library for rendering millions of data points at 60 FPS with interactive dashboards.
A lightweight, open-source 2D game engine for ActionScript 3 that leverages GPU acceleration via Stage3D for cross-platform deployment.
An audio library for PyTorch providing data manipulation, transformations, and dataset loaders for machine learning applications.
A Swift framework for GPU-accelerated image and video processing on Apple platforms using Metal.
A lightweight probabilistic programming library using NumPy and JAX for autograd and JIT compilation to GPU/TPU/CPU.
A general-purpose GPU compute framework built on Vulkan for cross-vendor graphics cards, enabling high-performance data processing and machine learning.
A Python library for loading, shaping, embedding, and exploring large graphs with GPU-accelerated visualization and analytics.
HyperLearn provides machine learning algorithms that its authors report run 2-2000x faster with 50% less memory usage on all hardware.
A scalable multi-platform physics simulation SDK for real-time collision detection, rigid body dynamics, and character controllers.
NVIDIA's implementation of the C++ Standard Library for CUDA C++ development.
A GPU-accelerated image and video processing framework for Apple platforms built on Metal.
An open-source library of high-performance, high-quality denoising filters for ray-traced images using deep learning.
A deep learning library for Rust featuring shape-checked tensors and neural networks with compile-time safety.
A deep learning library in Rust featuring shape-checked tensors and neural networks with compile-time safety.
A Python library implementing multiple alpha matting algorithms for extracting foreground objects from images.
A fast 2kB low-level WebGL library for GPU-accelerated particle systems and high-performance visual effects.
A fork of Emacs that adds modern features like TypeScript/JavaScript support via Deno, GPU-accelerated rendering with WebRender, and improved async I/O.
An abstraction layer over MetalPerformanceShaders for crafting and running fast neural networks on iOS using TensorFlow models.
A JIT compiler for writing high-performance GPU programs in .NET languages like C#, offering CUDA-level performance with C# convenience.
A fast Support Vector Machine (SVM) library that leverages GPUs and multi-core CPUs for high-performance machine learning.
A write-once-run-anywhere GPGPU library for Rust that abstracts WebGPU for CUDA-like compute with portability across desktop, mobile, and browser.