Showing 36 of 253 projects
A web application for training deep learning models with a focus on computer vision tasks.
A modular Python toolbox for state-of-the-art 6-DoF visual localization using hierarchical image retrieval and feature matching.
A pure JavaScript library for reading QR codes from raw image data in browsers and Node.js.
An open-source solution for continuous validation of machine learning models and data, from research to production.
A curated list of resources for action recognition, video understanding, object detection, and pose estimation in computer vision.
A local-first, ML-powered desktop application for translating manga, built in Rust with automated text detection, OCR, inpainting, and LLM translation.
A curated list of satellite and aerial imagery datasets with annotations for computer vision and deep learning tasks.
A JAX library for rapid prototyping of large-scale attention-based vision models across images, video, audio, and multimodal data.
A cross-platform userspace driver for the Microsoft Kinect, providing access to RGB/depth images, motors, accelerometer, LED, and audio.
A Python library for self-supervised learning on images, providing a modular PyTorch-like framework with support for modern SSL models.
A curated collection of research papers and resources on Vision Transformers (ViT) for computer vision tasks.
A deep neural network that automatically adds color to grayscale images.
A pure JavaScript OCR engine compiled from Ocrad via Emscripten for client-side text recognition in the browser.
A procedural Blender pipeline for generating photorealistic training images for computer vision and machine learning.
A PyTorch-based framework for visual object tracking and video object segmentation, featuring implementations of state-of-the-art trackers like TaMOs, RTS, and DiMP.
Implementation of SRGAN for photo-realistic single image super-resolution using generative adversarial networks.
Automatic and interactive image colorization using deep neural networks, with PyTorch models for the ECCV 2016 and SIGGRAPH 2017 papers.
An open-source C++ framework for optimizing graph-based nonlinear error functions, widely used in robotics and computer vision.
An open-source photogrammetric computer vision framework for 3D reconstruction and camera tracking from photographs and videos.
A PyTorch framework for deep learning research and development, focusing on reproducibility and rapid experimentation.
An open-source framework for building multimodal AI systems that enable large language models to understand and chat about videos and images.
A ROS2 wrapper for Intel RealSense cameras that provides depth, color, and IMU data as ROS topics and services.
A collection of autonomous driving datasets and evaluation code for advancing machine perception and self-driving research.
PyTorch implementation of FlowNet 2.0 for optical flow estimation using deep neural networks.
A hybrid Python/C++ Visual SLAM pipeline supporting monocular, stereo, and RGB-D cameras with modern features, loop closure, and dense reconstruction.
A TensorFlow implementation of neural style transfer for images and videos, blending content and artistic styles using convolutional neural networks.
A curated list of awesome open-source OCR software, libraries, datasets, and literature.
A multi-sensor calibration toolbox for autonomous driving, supporting IMU, LiDAR, camera, and radar calibration.
A simple and versatile framework for object detection and instance recognition with extensive model coverage and distributed training.
A curated list of deep learning image classification papers published since 2014, with their code implementations.
A versatile visual SLAM framework for monocular, stereo, and RGB-D cameras with map storage and reuse capabilities.
A blazing-fast SIMD-optimized image comparison library with Node.js API for visual regression testing.
A JavaScript library for real-time hand detection and pose classification directly in the browser using TensorFlow.js.
A curated list of research papers, datasets, and resources for anomaly detection in time-series, video, and image data.
A curated list of robotics libraries, simulators, and software for developers and researchers.
A curated list of academic papers, datasets, and resources for image and video deblurring research.