A hybrid Python/C++ Visual SLAM pipeline supporting monocular, stereo, and RGB-D cameras with modern features, loop closure, and dense reconstruction.
pySLAM is an open-source Visual SLAM framework that enables real-time camera tracking and 3D map building from monocular, stereo, or RGB-D video streams. It addresses the simultaneous localization and mapping (SLAM) problem: estimating a device's pose while reconstructing the surrounding environment, which is essential for autonomous robots, AR/VR, and 3D scanning applications.
pySLAM is aimed at researchers and developers in robotics, computer vision, and augmented reality who need a flexible, extensible platform for prototyping and evaluating Visual SLAM algorithms.
pySLAM stands out by offering a hybrid Python/C++ architecture, a vast library of modern features and models, and integrated modules for dense reconstruction, depth prediction, and semantic mapping—all within a single, cohesive environment.
pySLAM is a hybrid Python/C++ Visual SLAM pipeline supporting monocular, stereo, and RGB-D cameras. It provides a broad set of modern local and global feature extractors, multiple loop-closure strategies, a volumetric reconstruction module, integrated depth-prediction models, and semantic segmentation capabilities for enhanced scene understanding.
Integrates over 20 local feature detectors and descriptors, from classical ORB/SIFT to modern SuperPoint and ALIKED, enabling flexible experimentation in feature matching and tracking.
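Whichever extractor is selected, the downstream step is the same: match descriptors between frames with a nearest-neighbor search and a ratio test. The sketch below is not pySLAM's API; it is a minimal NumPy illustration of that matching step, with toy random descriptors standing in for real SIFT/SuperPoint outputs.

```python
import numpy as np

def match_descriptors(desc_a: np.ndarray, desc_b: np.ndarray, ratio: float = 0.8):
    """Brute-force nearest-neighbor matching with Lowe's ratio test.

    desc_a, desc_b: (N, D) float descriptor arrays (e.g. SIFT-style vectors).
    Returns (index_a, index_b) pairs that pass the ratio test.
    """
    matches = []
    for i, d in enumerate(desc_a):
        # Euclidean distance from descriptor i to every descriptor in desc_b.
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        # Accept only if the best match is clearly better than the runner-up.
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches

# Toy example: desc_b is a permuted, slightly perturbed copy of desc_a.
rng = np.random.default_rng(0)
desc_a = rng.normal(size=(5, 32))
perm = np.array([2, 0, 4, 1, 3])
desc_b = desc_a[perm] + 0.01 * rng.normal(size=(5, 32))
print(match_descriptors(desc_a, desc_b))
```

The ratio test discards ambiguous matches (where the two closest candidates are nearly equidistant), which is what keeps tracking stable when descriptors repeat across a scene.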
Provides end-to-end modules for visual odometry, full SLAM with loop closing, dense volumetric reconstruction (TSDF, Gaussian splatting), depth prediction, and semantic mapping—all in a single Python environment.
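At the heart of TSDF-based volumetric reconstruction is a per-voxel weighted running average of truncated signed distances (the Curless–Levoy update). The following is a simplified NumPy sketch of that update, not pySLAM's implementation; the function name and the three-voxel example are illustrative.

```python
import numpy as np

def integrate_tsdf(tsdf, weights, sdf_obs, trunc=0.05, max_weight=100.0):
    """One TSDF integration step over a flat array of voxels.

    tsdf, weights: current fused values and accumulated weights per voxel.
    sdf_obs: signed distances observed in the new frame (NaN = not seen).
    """
    # Truncate observed signed distances to [-trunc, +trunc].
    d = np.clip(sdf_obs, -trunc, trunc)
    seen = ~np.isnan(sdf_obs)
    # Weighted running average with unit observation weight, then cap the weight.
    new_tsdf = np.where(seen, (weights * tsdf + d) / (weights + 1.0), tsdf)
    new_w = np.where(seen, np.minimum(weights + 1.0, max_weight), weights)
    return new_tsdf, new_w

# Two frames observe a three-voxel line; NaN marks a voxel missed this frame.
tsdf, w = np.zeros(3), np.zeros(3)
tsdf, w = integrate_tsdf(tsdf, w, np.array([0.02, -0.10, np.nan]))
tsdf, w = integrate_tsdf(tsdf, w, np.array([0.04, np.nan, 0.01]))
print(tsdf, w)
```

Averaging over many frames is what lets TSDF fusion smooth out per-frame depth noise; the zero crossing of the fused field is then extracted as the surface.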
Offers interchangeable Python and C++ cores for balancing development flexibility with runtime performance, with cross-compatible map saving and loading between implementations.
Includes state-of-the-art techniques like incremental Gaussian splatting, NetVLAD for loop detection, and models like DUSt3R for multi-view scene inference, keeping pace with academic advances.
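NetVLAD-style loop detection reduces at query time to a nearest-neighbor search over L2-normalized global descriptors. The sketch below illustrates that search with random vectors standing in for real NetVLAD embeddings; the function name, score threshold, and recency window are assumptions, not pySLAM's actual interface.

```python
import numpy as np

def detect_loop(query, database, min_score=0.9, exclude_recent=10):
    """Cosine-similarity loop-candidate search over global descriptors.

    query: (D,) global descriptor of the current keyframe.
    database: (N, D) descriptors of past keyframes (e.g. NetVLAD outputs).
    The most recent `exclude_recent` keyframes are skipped so the query
    cannot trivially match its immediate temporal neighbors.
    """
    if len(database) <= exclude_recent:
        return None
    db = database[: len(database) - exclude_recent]
    # L2-normalize so the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    db = db / np.linalg.norm(db, axis=1, keepdims=True)
    scores = db @ q
    best = int(np.argmax(scores))
    return (best, float(scores[best])) if scores[best] >= min_score else None

# Simulate revisiting keyframe 3 out of 20 past keyframes.
rng = np.random.default_rng(1)
db = rng.normal(size=(20, 64))
result = detect_loop(2.0 * db[3], db)
print(result)
```

A real system would verify the candidate geometrically (feature matching plus a relative-pose check) before closing the loop, since appearance similarity alone produces false positives.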
The unified install script requires managing numerous dependencies (CUDA, PyTorch, OpenCV non-free modules) and has a dedicated troubleshooting file, indicating frequent setup challenges.
When using the default Python core, real-time performance is not guaranteed, especially with compute-heavy modules like depth prediction (e.g., DepthPro takes ~1s per image), necessitating the C++ core for speed.
With multiple configuration files (config.yaml, config_parameters.py) and hundreds of parameters across features, loop closing, and volumetric integration, tuning the system is non-trivial and error-prone.
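One common failure mode with large parameter sets is a silently mistyped key that leaves a default in effect. A small merge helper that rejects unknown keys catches this early; the parameter names below are hypothetical stand-ins, not pySLAM's actual configuration schema.

```python
# Hypothetical defaults standing in for a small slice of the parameter space;
# the real names live in config.yaml / config_parameters.py.
DEFAULTS = {
    "feature_tracker": "ORB",
    "num_features": 2000,
    "loop_closing_enabled": True,
    "tsdf_voxel_size": 0.02,
}

def merge_config(defaults: dict, overrides: dict) -> dict:
    """Overlay user overrides on defaults, failing fast on unknown keys."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    return {**defaults, **overrides}

cfg = merge_config(DEFAULTS, {"num_features": 4000})
print(cfg["num_features"])  # 4000
```

Validating overrides against the known schema turns a subtle tuning bug (e.g. `num_featuers`) into an immediate, explicit error.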
Some modules, such as front-end depth prediction, are labeled as 'experimental' or 'WIP' (work in progress) in the README, so production use may encounter bugs or incomplete functionality.