A curated list of recent research papers and resources on Vision and Language Pre-trained Models (VL-PTMs).
An open-source library providing chest X-ray datasets, pre-trained models, and tools for medical imaging research and analysis.
Automated, hardware-independent hand-eye calibration for ROS1 with GUI support for robot vision tasks.
A cross-platform library for USB video devices built on libusb, providing fine-grained control over UVC-compliant hardware.
A lean and fast C++ library for 3D point cloud data processing with efficient implementations of common operations.
A PyTorch implementation for super fast and accurate 3D object detection using LiDAR point clouds, featuring an anchor-free approach.
An open-source application for automated biological image analysis, enabling biologists to measure phenotypes from thousands of images.
A 3D segment-based mapping library for robot localization, environment reconstruction, and semantics extraction using LiDAR data.
An open-source tool for fast and accurate Optical Mark Recognition (OMR) from scanned documents or mobile photos.
A library of modular computer vision components built on Keras 3, supporting TensorFlow, JAX, and PyTorch backends.
A TensorFlow-based GAN model that upscales images by 4x while generating photo-realistic details.
A collection of pure-Rust computer vision libraries providing algorithms for photogrammetry, image processing, and pattern recognition.
A framework for semantic and instance segmentation of LiDAR point clouds using range images, designed for autonomous driving applications.
An efficient LiDAR-based semantic SLAM system that builds 3D semantic maps from laser scans.
A PyTorch framework for efficient 3D semantic and panoptic segmentation using superpoint-based transformer architectures.
An efficient algorithm for multi-person pose estimation and tracking in videos, published at CVPR 2018, that ranked first in the ICCV 2017 PoseTrack challenge.
High-level TensorFlow network definitions with pre-trained weights for easy integration into existing ML workflows.
A PyTorch-based framework for training and validating models that produce high-quality embeddings for metric learning and retrieval tasks.
Deep learning inference nodes for ROS/ROS2 with support for NVIDIA Jetson devices and TensorRT.
A lightweight TensorFlow library for training and evaluating Generative Adversarial Networks (GANs).
A Rust image processing library for computer vision and graphics applications, built on the image crate.
A ROS package collection for processing raw camera images into calibrated, rectified formats for computer vision applications.
A virtual environment simulator for training embodied AI agents with real-world perception and physics, featuring domain transfer to real robots.
Official JAX implementation of Mip-NeRF, a multiscale neural radiance field model for anti-aliased novel view synthesis.
A video stabilization library that plugs into FFmpeg and Transcode to smooth shaky footage from handheld or vehicle-mounted cameras.
A curated list of open-source and commercial tools for labeling and managing datasets across images, audio, time series, and text.
A Python API for the Argoverse dataset, providing tools for 3D tracking, motion forecasting, and HD map interaction for autonomous vehicle research.
An open-source simulator for experimenting with and advancing self-driving AI, accessible to anyone with a PC.
A curated list of awesome tutorials, blogs, and projects for the CARLA autonomous driving simulator.
A curated list of face-related algorithms, datasets, papers, and open-source libraries for computer vision research.
A PyTorch implementation of Social GAN for predicting socially acceptable human trajectories using generative adversarial networks.
TensorFlow implementation of an attention-based neural image caption generator that focuses on relevant image parts while generating words.
A Ruby binding for the libvips image processing library, offering fast, memory-efficient image operations.
A modular ROS package for 3D/6D robot localization and point cloud registration using PCL, with dynamic map updates via OctoMap.
An open-source visual-inertial odometry system that estimates camera motion and sparse 3D maps from camera and IMU data.
Monocular 3D object detection and SLAM system that detects and tracks cuboids to estimate camera and object poses.