A Python devkit for loading, exploring, and manipulating the PandaSet, a large-scale autonomous driving dataset with LiDAR, camera, and annotations.
pandaset-devkit is a Python library that provides tools to work with the PandaSet, a large-scale autonomous driving dataset. It simplifies loading, accessing, and manipulating LiDAR point clouds, camera images, GPS data, and annotations like 3D cuboids and semantic segmentation labels. The toolkit abstracts away the raw dataset structure, allowing researchers to focus on analysis and model development.
Autonomous driving researchers, computer vision engineers, and data scientists working with LiDAR and camera datasets for perception tasks like object detection and semantic segmentation.
It offers a clean, pandas-centric API that integrates with the Python data science ecosystem, making it easier to explore and preprocess complex multi-sensor data than handling raw files directly.
The pandaset-devkit is a Python toolkit designed to work with the PandaSet, a public large-scale dataset for autonomous driving research. It provides a clean, intuitive API for data scientists and engineers to efficiently load sensor data, access annotations, and perform analysis without dealing with raw file complexities.
Individual loaders such as load_lidar() and load_cuboids() can be called on their own to optimize memory usage.
The devkit is built to simplify interaction with complex autonomous driving datasets, prioritizing ease of use, flexibility, and integration with popular Python data science libraries like pandas and NumPy.
The DataSet class automatically discovers sequences, and each sequence provides methods like load() and load_lidar(), abstracting away raw file complexities for efficient access.
LiDAR point clouds and cuboid annotations are stored as pandas DataFrames, enabling seamless manipulation and analysis with popular Python data science libraries.
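As a rough illustration of what DataFrame-based access makes possible, the sketch below filters a point cloud by range and counts cuboids per label using plain pandas. The data is synthetic, and the column names (x, y, z, i, label, position.x, position.y) are assumptions made for this example rather than a confirmed description of the devkit's output.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one LiDAR frame. The column names are assumptions.
rng = np.random.default_rng(0)
points = pd.DataFrame({
    "x": rng.uniform(-50, 50, 1000),
    "y": rng.uniform(-50, 50, 1000),
    "z": rng.uniform(-2, 4, 1000),
    "i": rng.uniform(0, 255, 1000),
})

# Synthetic stand-in for cuboid annotations (hypothetical labels and columns).
cuboids = pd.DataFrame({
    "label": ["Car", "Car", "Pedestrian", "Bus"],
    "position.x": [5.0, -12.0, 3.0, 30.0],
    "position.y": [2.0, 8.0, -1.0, -20.0],
})


def points_within(df, radius):
    """Keep points whose horizontal distance from the origin is <= radius."""
    dist = np.hypot(df["x"], df["y"])
    return df[dist <= radius]


nearby = points_within(points, 20.0)             # range-limited point cloud
label_counts = cuboids["label"].value_counts()   # number of cuboids per label
```

Because both tables are ordinary DataFrames, the same boolean masks and aggregations used elsewhere in pandas apply without any dataset-specific helpers.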
Supports slicing and selective loading (e.g., set_sensor() for LiDAR), allowing optimized memory usage and tailored data retrieval for specific research needs.
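The effect of selecting a single LiDAR sensor can be approximated with a pandas filter. This is a minimal sketch, not the devkit's implementation: select_sensor is a hypothetical helper, and the d column holding a per-point sensor id is an assumption about the frame layout.

```python
import pandas as pd

# Hypothetical frame: "d" is assumed to hold the id of the LiDAR unit
# that produced each point.
frame = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [0.5, 0.1, -0.3, 2.2],
    "z": [0.0, 0.2, 0.1, 1.5],
    "d": [0, 1, 0, 1],
})


def select_sensor(df, sensor_id=None):
    """Return only the points from one sensor, or all points if sensor_id is None."""
    if sensor_id is None:
        return df
    return df[df["d"] == sensor_id].reset_index(drop=True)


single_sensor = select_sensor(frame, sensor_id=1)

# Slicing a list of frames works like any Python sequence: every second frame.
frames = [frame, frame, frame, frame]
subsampled = frames[::2]
```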
Includes 42-class semantic segmentation and 3D cuboid annotations accessible via a consistent API, facilitating advanced perception tasks like object detection.
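Per-point semantic labels pair naturally with the point cloud once both are DataFrames. The sketch below assumes one class id per point, aligned by row index, and uses a made-up three-entry id-to-name mapping. The real 42-class mapping is not reproduced here.

```python
import pandas as pd

# Hypothetical subset of an id -> name mapping, for illustration only.
class_names = {1: "Road", 2: "Car", 3: "Vegetation"}

# Synthetic points and per-point class ids, aligned by row index (an assumption).
points = pd.DataFrame({
    "x": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [0.0, 0.5, 1.0, 1.5, 2.0, 2.5],
    "z": [0.0, 0.0, 0.8, 2.0, 2.5, 3.0],
})
semseg = pd.DataFrame({"class": [1, 1, 2, 3, 3, 3]})

# Attach the class id to each point, then translate ids into readable names.
labeled = points.join(semseg)
labeled["class_name"] = labeled["class"].map(class_names)

points_per_class = labeled["class_name"].value_counts()
car_points = labeled[labeled["class_name"] == "Car"]
```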
The toolkit is exclusively designed for the PandaSet format, making it unsuitable for other autonomous driving datasets without significant modification or workarounds.
Loading full sequences into memory, especially large LiDAR and camera data, can be resource-heavy, potentially straining standard workstations or cloud instances.
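One way to limit memory pressure is to stream frames one at a time instead of holding a whole sequence. The generator below is a sketch under stated assumptions: it treats each frame as a gzip-compressed pickle file ending in .pkl.gz, which is an assumption about the on-disk layout, and iter_frames and count_points are hypothetical helpers rather than devkit functions. The demo writes plain lists in place of real frames.

```python
import gzip
import pickle
import tempfile
from pathlib import Path


def iter_frames(frame_dir):
    """Yield (file name, frame) pairs one at a time, in sorted file order."""
    for path in sorted(Path(frame_dir).glob("*.pkl.gz")):
        with gzip.open(path, "rb") as fh:
            # Only unpickle files from a source you trust.
            yield path.name, pickle.load(fh)


def count_points(frame_dir):
    """Sum the length of every frame without keeping earlier frames alive."""
    total = 0
    for _name, frame in iter_frames(frame_dir):
        total += len(frame)
    return total


# Demo with two tiny stand-in frames written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for index, frame in enumerate([[1, 2, 3], [4, 5]]):
        with gzip.open(Path(tmp) / f"{index:02d}.pkl.gz", "wb") as fh:
            pickle.dump(frame, fh)
    total = count_points(tmp)
```

Each frame becomes eligible for garbage collection as soon as the loop moves on, so peak memory stays close to the size of a single frame.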
Semantic segmentation annotations are not available for all scenes, requiring users to filter sequences manually, which adds extra steps and complexity.
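The manual filtering step can be scripted. The sketch below assumes each sequence is a directory under the dataset root and that semantic segmentation labels, when present, live in an annotations/semseg subdirectory. Both the layout and the sequences_with_semseg helper are assumptions made for this example.

```python
import tempfile
from pathlib import Path


def sequences_with_semseg(root):
    """Return the names of sequence directories that contain annotations/semseg."""
    root = Path(root)
    return sorted(
        seq.name
        for seq in root.iterdir()
        if seq.is_dir() and (seq / "annotations" / "semseg").is_dir()
    )


# Demo on a throwaway directory tree: only sequence 001 has semseg labels.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "001" / "annotations" / "semseg").mkdir(parents=True)
    (Path(tmp) / "002" / "annotations" / "cuboids").mkdir(parents=True)
    found = sequences_with_semseg(tmp)
```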