A high-throughput, memory-efficient inference and serving engine for large language models (LLMs).
A unified deep learning system for efficient large-scale model training and inference with advanced parallelism strategies.
An open-source inference serving platform for deploying AI models from multiple frameworks across cloud, data center, and edge devices.
An LLM acceleration library for Intel XPU (GPU, NPU, CPU) to speed up local inference and fine-tuning of popular models.
A Python library for building production-ready model inference APIs, job queues, and multi-model serving systems for AI applications.
A collection of best practices and guidelines for optimizing Vulkan applications on mobile devices with Arm GPUs.
A PyTorch implementation of TResNet, a high-performance convolutional neural network architecture optimized for GPU training and inference.