An LLM acceleration library for Intel XPU (GPU, NPU, CPU) to speed up local inference and finetuning of popular models.
An open-source cross-platform performance library of basic building blocks for deep learning applications, optimized for CPUs and GPUs.
A compiler for a C-based SPMD language that generates high-performance SIMD code for CPUs and GPUs.
A portable C++ library providing SIMD vector types for explicit data-parallel programming with zero-overhead abstractions.
A free software AI accelerator that speeds up scikit-learn applications by 10-100x on CPUs and GPUs with no code changes.
A C++ template library providing high-performance SIMD-accelerated sorting algorithms for integers, floats, and custom objects.
A Pascal-based deep learning neural network API optimized for AVX/AVX2/AVX512 and OpenCL, supporting AMD, Intel, and NVIDIA hardware.
A fast, header-only C/C++ library for counting set bits (1 bits) in arrays using optimized CPU instructions and vector extensions such as POPCNT, AVX2, AVX512, NEON, and SVE.
A fork of OpenAI's Whisper speech recognition models optimized with an OpenVINO backend for faster CPU inference.
A deprecated sample comparing OpenGL and Vulkan rendering techniques for CAD scenes using multithreaded command buffer generation.
A high-performance fork of the FastMM4 memory manager with AVX/AVX2/AVX512 support, efficient synchronization, and FreePascal compatibility.
A Vulkan sample application that renders 200,000 animated particles using multithreaded draw calls to demonstrate low CPU overhead.