Real-time 3D semantic mapping system using a handheld RGB-D camera, built on ROS with ORB_SLAM2 and PSPNet.
Semantic SLAM is a real-time 3D mapping system that builds semantically annotated maps from a handheld RGB-D camera. It solves the problem of environment understanding for robotics by combining simultaneous localization and mapping (SLAM) with deep learning-based semantic segmentation to label objects (e.g., walls, furniture) in the generated 3D map.
Robotics researchers, computer vision engineers, and developers working on autonomous navigation, augmented reality, or environmental modeling who need real-time semantic 3D perception.
It offers an open-source, integrated pipeline for semantic SLAM using commodity hardware, with configurable output (semantic or RGB maps) and multiple fusion methods, built on robust ROS and proven libraries like ORB_SLAM2 and OctoMap.
Real-time semantic SLAM in ROS with a handheld RGB-D camera
Integrates ORB_SLAM2, depth data, and PSPNet segmentation to generate 3D semantic octomaps at interactive rates, as demonstrated in the run-time analysis with 1 Hz map updates.
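The geometric core of this pipeline is back-projecting each segmented depth pixel into a camera-frame 3D point before inserting it into the octomap. A minimal sketch of that step, assuming a standard pinhole camera model (the function name and intrinsics here are illustrative, not taken from the project's code):

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Back-project a depth image (meters) into camera-frame 3D points.

    Returns an (H, W, 3) array of [x, y, z] points; each point can then
    carry the semantic label predicted for its pixel.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)
```

With the camera pose from ORB_SLAM2, these points are transformed into the world frame and fused voxel-wise into the octomap.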
Allows switching between semantic octomaps with object labels and standard RGB octomaps via parameter settings, offering flexibility for different perception tasks.
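In ROS, such a switch is typically exposed as a node parameter in a launch file. A hypothetical sketch of what that might look like (the package, node, and parameter names below are illustrative, not the project's actual identifiers):

```xml
<!-- Illustrative launch snippet: names are assumptions, not the project's API. -->
<launch>
  <node pkg="semantic_slam" type="semantic_mapping_node" name="semantic_mapping">
    <!-- "semantic" colors voxels by object label; "rgb" keeps camera colors. -->
    <param name="octomap/colored_by" value="semantic" />
    <param name="octomap/resolution" value="0.05" />
  </node>
</launch>
```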
Supports max-confidence or Bayesian fusion for semantic label integration, providing options for robustness in label assignment, as detailed in the project report.
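The two fusion strategies can be sketched per voxel as follows; this is a minimal illustration of the general techniques, not the project's actual implementation:

```python
import numpy as np

def bayesian_fusion(prior, likelihood):
    """Fuse a new per-class probability vector into a voxel's current
    distribution: element-wise product followed by renormalization."""
    posterior = prior * likelihood
    return posterior / posterior.sum()

def max_confidence_fusion(current_label, current_conf, new_label, new_conf):
    """Keep whichever (label, confidence) pair has the higher confidence."""
    if new_conf > current_conf:
        return new_label, new_conf
    return current_label, current_conf
```

Bayesian fusion accumulates evidence over many observations and so tends to be more robust to single bad segmentations, while max-confidence is cheaper and simpler to store.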
Implemented as ROS nodes with launch files, compatible with common RGB-D cameras like Asus Xtion and ROS bags, simplifying deployment in robotic systems.
Relies on PyTorch 0.4.0 and the original ORB_SLAM2 codebase, both of which are dated, may cause compatibility issues on modern systems, and no longer receive updates.
Requires manual building of ORB_SLAM2, handling multiple dependencies, and configuring parameters in YAML files, making setup non-trivial and error-prone.
Semantic segmentation runs at only 2-3 Hz, as noted in the run-time section, which could bottleneck real-time applications in fast-moving scenarios.
Primarily tested with Asus Xtion camera and specific GPU hardware, with no clear documentation for adapting to other RGB-D sensors or edge devices.