A general-purpose GPU compute framework built on Vulkan for cross-vendor graphics cards, enabling high-performance data processing and machine learning.
HyperLearn is a machine learning library that claims 2-2000x faster algorithms with 50% less memory usage across all hardware.
NVIDIA's implementation of the C++ Standard Library for CUDA C++ development.
A Node.js module that implements the Web Worker API using native threads for CPU-bound tasks.
A cross-platform, dependency-free C++ and Python DAG framework for building parallel computational graphs.
A distributed query execution engine that extends Apache DataFusion to run SQL queries in parallel across multiple nodes.
A hybrid thread/fiber task scheduler for C++11 that enables efficient execution of blocking tasks.
A visual workflow-based AI deployment framework for multi-platform and multi-backend inference, supporting large models and edge devices.
A Go toolkit for building concurrent programs using composable, channel-based pipelines with automatic error propagation.
A header-only C++ library for CUDA providing accelerated primitives for solving irregularly parallel problems on GPUs.
A JIT compiler for writing high-performance GPU programs in .NET languages like C#, offering CUDA-level performance with C# convenience.
A high-performance, feature-rich Entity Component System (ECS) library for Rust game development with minimal boilerplate.
A C++ GPU computing library providing an STL-like interface for OpenCL-based parallel programming.
A collection of high-performance GICP-based point cloud registration algorithms with multi-threaded and GPU-accelerated implementations.
A distributed map-reduce framework for parallel computations over large datasets on unreliable computer clusters.
A library for Elixir, built on top of GenStage, for expressing parallel computational flows over collections.
A portable C++ library providing SIMD vector types for explicit data-parallel programming with zero-overhead abstractions.
A C++-based high-performance parallel environment execution engine for vectorized reinforcement learning simulations.
A lightweight concurrency framework for C++11 inspired by Microsoft PPL, providing tasks, parallel algorithms, and schedulers.
A functional programming toolkit for R that enhances data manipulation with consistent, type-stable functions for working with vectors and lists.
A functional programming toolkit for R that enhances data manipulation with consistent, type-stable functions.
A collection of reusable scientific computing software components for solving large-scale, complex multi-physics engineering problems.
A framework for building parallel, multi-disciplinary simulation software, focusing on modularity, extensibility, and high-performance computing.
A C++17 library providing efficient STL-like data structures (vector, unordered_map, etc.) for GPU programming with CUDA, OpenMP, and HIP backends.
A Python library for parallel active learning of mathematical functions, intelligently sampling parameter spaces to minimize evaluations.
High-performance, end-to-end reinforcement learning implementations fully written in JAX for massive parallelization on GPUs.
A microbenchmark that spawns one million concurrent actors/coroutines to compare concurrency performance across programming languages.
A C++ library for parallel text file reading with CSV support and Python bindings.
A reactive programming library for C++14 that enables declarative data dependencies and automatic change propagation.
A tool for running Appium tests in parallel across Android and iOS real devices and simulators.
A C++ template library providing high-performance SIMD-accelerated sorting algorithms for integers, floats, and custom objects.
An evolutionary optimization library for Go implementing genetic algorithms, particle swarm optimization, differential evolution, and other algorithms.
A JAX-based library providing accelerated reinforcement learning environments that are fully compatible with the classic gym API.
Thin, unified C++ wrappers for NVIDIA's CUDA APIs (Runtime, Driver, NVRTC, NVTX) that improve safety and ease of use.
An R package for the quantitative analysis of textual data, providing comprehensive tools for natural language processing and text management.
A lock-free, wait-free, continuation-stealing tasking library for C++20 built on coroutines, enabling ultra-fine-grained parallelism.