CUDA

116 projects

llama.cppC++

A C/C++ library for efficient, cross-platform LLM inference with extensive hardware support and quantization.

#cuda#ggml#metal

Stars121.4k

Forks20.9k

Last commit8 hours ago

vllmPython

A high-throughput, memory-efficient inference and serving engine for large language models (LLMs).

#distributed-inference#transformer#cuda

Stars87.0k

Forks19.8k

Last commit7 hours ago

Caffe Model ZooC++

A fast open framework for deep learning with a focus on expression, speed, and modularity.

#cuda#deep-learning#neural-networks

Stars34.6k

Forks18.4k

Last commit2 years ago

OpenposeC++

Real-time multi-person keypoint detection library for body, face, hands, and foot estimation.

#cuda#pose-estimation#human-behavior-understanding

Stars34.3k

Forks8.0k

Last commit2 years ago

sglangPython

A high-performance serving framework for large language models and multimodal models, delivering low-latency and high-throughput inference.

#transformer#cuda#llm-serving

Stars30.7k

Forks7.4k

Last commit5 hours ago

GitHub repositoryC++

An open-source, high-performance platform for developing, testing, and deploying autonomous vehicles.

#lidar#robotics#autonomous-driving

Stars26.8k

Forks10.0k

Last commit3 months ago

DarknetC

An open source neural network framework in C and CUDA, known for YOLO real-time object detection models.

#cuda#deep-learning#c-language

Stars26.5k

Forks21.0k

Last commit2 years ago

BuzzPython

An offline desktop application for transcribing and translating audio/video files, live recordings, and YouTube links using OpenAI's Whisper.

#vulkan#cuda#desktop-application

Stars20.3k

Forks1.5k

Last commit7 days ago

CNTK - Microsoft Cognitive ToolkitC++

A unified deep learning toolkit for describing neural networks as computational graphs, supporting feed-forward DNNs, CNNs, and RNNs/LSTMs.

#cntk#cognitive-toolkit#cuda

Stars17.6k

Forks4.2k

Last commit3 years ago

nvidia-docker

A deprecated wrapper that enabled Docker containers to access NVIDIA GPU resources.

#nvidia-docker#cuda#container-runtime

Stars17.6k

Forks2.1k

Last commit2 years ago

KaldiShell

A comprehensive open-source toolkit for speech recognition research and development.

#cuda#research-toolkit#speaker-id

Stars15.4k

Forks5.4k

Last commit10 months ago

TensorRTC++

NVIDIA's SDK for high-performance deep learning inference optimization and deployment on NVIDIA GPUs.

#cuda#neural-network#nvidia

Stars13.2k

Forks2.4k

Last commit16 days ago

cupyPython

A NumPy/SciPy-compatible array library for GPU-accelerated computing with Python, supporting NVIDIA CUDA and AMD ROCm.

#cuda#scientific-computing#high-performance-computing

Stars12.2k

Forks1.1k

Last commit7 hours ago

TaskflowC++

A fast, expressive, and header-only C++ library for building task-parallel programs with static, dynamic, and conditional task graphs.

#work-stealing#threadpool#cuda

Stars12.1k

Forks1.4k

Last commit8 days ago

numbaPython

A NumPy-aware dynamic Python compiler using LLVM.

#cuda#compiler#numba

Stars11.1k

Forks1.3k

Last commit9 hours ago

cudfC++

A GPU-accelerated DataFrame library for tabular data processing, part of the RAPIDS data science suite.

#cudf#cuda#apache-arrow

Stars9.7k

Forks1.1k

Last commit8 hours ago

gocvGo

Go language bindings for OpenCV 4, enabling computer vision applications with support for CUDA, DNN, and OpenVINO.

#cuda#video-processing#opencv

Stars7.5k

Forks903

Last commit1 month ago

ChainerPython

A flexible Python deep learning framework using define-by-run dynamic computational graphs for neural network research.

#research-tool#cuda#chainer

Stars5.9k

Forks1.3k

Last commit2 years ago

leafRust

An open-source machine learning framework for building classical, deep, or hybrid ML applications with a focus on performance and portability.

#cuda#opencl#deep-learning

Stars5.5k

Forks268

Last commit2 years ago

cuMLPython

A suite of GPU-accelerated machine learning algorithms with scikit-learn-compatible APIs, reported to run 10-50x faster on large datasets.

#cuda#data-science#nvidia

Stars5.2k

Forks645

Last commit8 hours ago

GitHub repositoryPython

A PyTorch library providing GPU-accelerated tools for 3D deep learning, including differentiable rendering and geometric operations.

#cuda#rasterization#differentiable-lighting

A C++ parallel algorithms library that enables high-performance computing on GPUs and multicore CPUs with a productivity-focused interface.

#cuda#parallel-computing#high-performance-computing

Stars5.0k

Forks760

Last commit2 years ago

NCCLC++

A library of optimized communication primitives for multi-GPU and multi-node collective operations.

#multi-gpu#cuda#distributed-training

Stars4.9k

Forks1.4k

Last commit14 hours ago

ArrayFireC++

A general-purpose tensor library for parallel computing across CPUs, GPUs, and hardware accelerators.

#oneapi#cuda#scientific-computing

Stars4.9k

Forks555

Last commit4 months ago

jetson-containersJupyter Notebook

A modular container build system providing the latest AI/ML packages for NVIDIA Jetson and JetPack-L4T.

#robotics#cuda#ros-containers

Stars4.8k

Forks841

Last commit4 days ago

Warp-CTCCuda

A fast parallel implementation of the Connectionist Temporal Classification (CTC) loss function for CPU and GPU.

#cuda#parallel-computing#torch-binding

Stars4.1k

Forks1.0k

Last commit2 years ago

LygiaGLSL

A granular, multi-language shader library for real-time graphics, supporting GLSL, HLSL, Metal, WGSL, and CUDA.

#cuda#real-time-graphics#library

Stars3.4k

Forks222

Last commit4 months ago

RemoteryC

A realtime CPU/GPU profiler hosted in a single C file with a remote web viewer for performance analysis.

#c-library#vulkan#cuda

Stars3.3k

Forks284

Last commit1 year ago

ViseronPython

Self-hosted, local-only NVR and AI computer vision software with object detection, motion detection, face recognition, and more, for monitoring a home, office, or any other location.

#cuda#network video capture#network video recorder

Stars3.3k

Forks403

Last commit

flownet2-pytorchPython

PyTorch implementation of FlowNet 2.0 for optical flow estimation using deep neural networks.

#cuda#deep-learning#neural-networks

Stars3.3k

Forks749

Last commit3 months ago

nnablaPython

A deep learning framework for research, development, and production with flexible Python API and C++ core.

#cuda#model-training#deep-learning

Stars2.8k

Forks336

Last commit10 months ago

KokkosC++

A C++ programming model for writing performance-portable applications targeting all major HPC platforms.

#cuda#sycl#parallel-computing

Stars2.6k

Forks513

Last commit12 hours ago

EGO-PlannerC++

A lightweight gradient-based local planner for quadrotors that eliminates ESDF construction, achieving planning times around 1ms.

#cuda#gradient-based-optimization#real-time-planning

Stars2.6k

Forks404

Last commit1 year ago

darknet_rosC++

A ROS package for real-time object detection in camera images using YOLO (V3) on GPU and CPU.

#robotics#cuda#opencv

Stars2.4k

Forks1.2k

Last commit2 years ago

libcudacxxC++

NVIDIA's implementation of the C++ Standard Library for CUDA C++ development.

#cuda#parallel-computing#high-performance-computing

Stars2.3k

Forks194

Last commit2 years ago
