A unified deep learning system for efficient large-scale model training and inference with advanced parallelism strategies.
An industrial deep learning framework supporting unified dynamic/static graphs, automatic parallelism, and integrated training/inference for large models.
An industrial deep learning framework from China supporting unified dynamic/static graphs, automatic parallelism, and integrated training/inference for large models.
A fast, expressive, and header-only C++ library for building task-parallel programs with static, dynamic, and conditional task graphs.
A C++ GPU computing library providing an STL-like interface for OpenCL-based parallel programming.
A comprehensive GPU tool suite for debugging, profiling, and analyzing OpenCL, OpenGL, Vulkan, and DirectX kernels and shaders.
A high-productivity C++ library for parallel programming across devices using Data Parallel C++ (DPC++) APIs.
A GPU-optimized C++ template library providing high-performance implementations of common algorithms such as scan, reduce, transform, and sort.
A Clojure library for parallel computations using OpenCL 2.0 with fast JNI bindings.