A Python library for image augmentation in machine learning, offering a stochastic pipeline approach with fine-grained control over operations.
Augmentor is a Python library for image augmentation specifically designed for machine learning applications. It automates the generation of augmented images to expand datasets, using a stochastic pipeline approach where users define sequences of operations like rotations, distortions, and flips. The library aims to provide fine-grained control over augmentation techniques while remaining platform and framework independent.
Augmentor is aimed at machine learning practitioners, data scientists, and researchers working on computer vision projects who need to augment image datasets for training neural networks or other deep learning models. It is particularly useful for Keras or PyTorch users who need customizable pipelines built from augmentation techniques relevant to real-world data.
Developers choose Augmentor for its standalone, framework-agnostic design, which offers more convenience and finer control than the augmentation tools built into individual frameworks. Its pipeline-based stochastic approach, support for ground truth data, and advanced transformations like elastic distortions make it a versatile choice for complex augmentation needs.
Image augmentation library in Python for machine learning.
Allows building custom sequences of stochastic operations with fine-grained control, as shown in the README where users chain distortions, flips, and crops to generate diverse augmented images.
Augments images and their masks together using the ground_truth() function, applying identical transformations to each image-mask pair for segmentation tasks, demonstrated with side-by-side examples in the documentation.
Provides generators for Keras and PyTorch via keras_generator() and torch_transform(), enabling on-the-fly augmentation during training without saving to disk, as outlined in the integration notebooks.
Includes unique operations like elastic distortions and perspective transforms that preserve image size without black padding, highlighted in the README with comparative visuals against other methods.
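The stochastic pipeline idea and the paired image/mask behavior described above can be illustrated with a small standard-library sketch. This is not Augmentor's implementation or API: the StochasticPipeline class, its add() and sample() methods, and the three toy operations are hypothetical names chosen for illustration, and images are plain nested lists rather than real image objects. The point is only that each operation fires with its own probability, and that the same random decision is applied to an image and its mask.

```python
import random


class StochasticPipeline:
    """Toy pipeline (hypothetical, not Augmentor's class):
    each operation fires with its own probability."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._operations = []  # list of (probability, function) pairs

    def add(self, probability, operation):
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        self._operations.append((probability, operation))
        return self  # returning self allows chained add() calls

    def sample(self, image, mask=None):
        """Return one augmented (image, mask) pair.

        The random draw happens once per operation, so the image and its
        mask always receive exactly the same transformations.
        """
        for probability, operation in self._operations:
            if self._rng.random() < probability:
                image = operation(image)
                if mask is not None:
                    mask = operation(mask)
        return image, mask


def flip_left_right(grid):
    """Mirror each row."""
    return [row[::-1] for row in grid]


def flip_top_bottom(grid):
    """Reverse the order of the rows."""
    return grid[::-1]


def rotate90(grid):
    """Rotate 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


# A 2x2 "image" and a matching "mask", both as nested lists.
image = [[1, 2], [3, 4]]
mask = [[0, 1], [1, 0]]

pipeline = StochasticPipeline(seed=0)
pipeline.add(0.5, flip_left_right).add(0.5, flip_top_bottom).add(0.8, rotate90)

# Each call makes fresh random decisions, so the outputs vary.
samples = [pipeline.sample(image, mask) for _ in range(5)]
for augmented_image, augmented_mask in samples:
    print(augmented_image, augmented_mask)
```

Because the operations here take no random parameters, sharing the apply-or-skip decision is enough to keep the pair aligned. An operation with random parameters, such as a rotation angle, would also need to reuse the drawn parameter for the mask.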
Lacks support for 3D image augmentation, which is a critical gap for fields like medical imaging where volumetric data (e.g., DICOM files) is common, and the README does not mention any plans for 3D features.
Requires each operation to be added to the pipeline manually in code, which can be more cumbersome and error-prone than defining a pipeline in a configuration file, as libraries like Albumentations allow by serializing pipelines.
The README admits that multi-threading can slow down pipelines with very small images, forcing users to disable it via multi_threaded=False, indicating a performance trade-off that requires manual tuning.
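The multi-threading trade-off mentioned above can be explored with a rough standard-library sketch. Only the flag name multi_threaded comes from the text. The augment() and sample() functions, the max_workers parameter, and the tiny list-based images are hypothetical stand-ins, not Augmentor's code. When each unit of work is very cheap, the cost of scheduling tasks on a thread pool can outweigh any gain, which is why a switch to fall back to a plain loop is useful. Timings depend on the machine, so none are quoted here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def augment(image):
    """Stand-in for one cheap augmentation: mirror a tiny image."""
    return [row[::-1] for row in image]


def sample(images, multi_threaded=True, max_workers=4):
    """Augment every image, optionally through a thread pool."""
    if multi_threaded:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # pool.map preserves input order, like the sequential path.
            return list(pool.map(augment, images))
    return [augment(image) for image in images]


# Many very small images, the case where threading overhead dominates.
tiny_images = [[[0, 1], [2, 3]] for _ in range(2000)]

for flag in (True, False):
    start = time.perf_counter()
    result = sample(tiny_images, multi_threaded=flag)
    elapsed = time.perf_counter() - start
    print(f"multi_threaded={flag}: {elapsed:.4f}s for {len(result)} images")
```

Both paths return the same results in the same order, so the flag can be tuned purely on measured speed for a given dataset.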
Open Source Computer Vision Library
Datasets, Transforms and Models specific to Computer Vision
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
Image augmentation for machine learning experiments.