Winning solution for the Galaxy Challenge on Kaggle, using convolutional neural networks to classify galaxy morphologies.
kaggle-galaxies is the winning solution for the Galaxy Challenge on Kaggle, a competition focused on classifying galaxy morphologies from astronomical images. It trains an ensemble of convolutional neural networks with extensive data augmentation, achieving high accuracy on automated morphology classification.
Data scientists and machine learning practitioners interested in astronomical image classification, Kaggle competition participants studying winning solutions, and researchers working on computer vision applications in astronomy.
Provides a complete, documented implementation of a competition-winning deep learning pipeline optimized for galaxy morphology classification. The ensemble approach combined with aggressive data augmentation demonstrates practical techniques for improving model robustness and accuracy in image classification tasks.
Winning solution for the Galaxy Challenge on Kaggle (http://www.kaggle.com/c/galaxy-zoo-the-galaxy-challenge)
Blends predictions from multiple CNN models, an ensembling step that contributed to the solution's #1 Kaggle ranking.
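A minimal sketch of how such prediction blending works in NumPy. This is an illustration, not the repo's actual blending code; the function name and the uniform default weighting are assumptions.

```python
import numpy as np

def blend_predictions(prediction_sets, weights=None):
    """Average per-model prediction arrays (hypothetical sketch).

    prediction_sets: list of (n_samples, n_outputs) arrays, one per model.
    weights: optional per-model weights; defaults to a uniform blend.
    """
    stacked = np.stack(prediction_sets)  # (n_models, n_samples, n_outputs)
    if weights is None:
        weights = np.full(len(prediction_sets), 1.0 / len(prediction_sets))
    # Weighted sum over the model axis yields the blended predictions.
    return np.tensordot(weights, stacked, axes=1)

# Example: blend three toy models' outputs for two samples.
preds = [np.array([[0.2, 0.8], [0.6, 0.4]]),
         np.array([[0.4, 0.6], [0.5, 0.5]]),
         np.array([[0.3, 0.7], [0.7, 0.3]])]
blended = blend_predictions(preds)
```

Uniform blending is the simplest case; per-model weights can be tuned on a validation set.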
Uses rotations and flips to increase training diversity, improving generalization without additional data, as outlined in the augmentation scripts.
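The rotation/flip idea can be sketched as generating the eight dihedral variants of each image. This is a simplified stand-in for the repo's augmentation pipeline, which also applies translations, zooms, and other perturbations; the function name is hypothetical.

```python
import numpy as np

def dihedral_views(image):
    """Return the 8 rotation/flip variants of a 2D image array.

    Covers the dihedral group: 4 rotations of the original plus
    4 rotations of its horizontal mirror.
    """
    views = []
    for base in (image, np.fliplr(image)):
        for k in range(4):                  # 0, 90, 180, 270 degrees
            views.append(np.rot90(base, k))
    return views

views = dihedral_views(np.arange(16).reshape(4, 4))
```

Each variant is a valid training example for rotation-invariant targets like galaxy morphology, multiplying effective training data eightfold at no labeling cost.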
Leverages Theano for GPU acceleration, with specific tuning tips such as disabling garbage collection, targeted at hardware like the GeForce GTX 680.
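Theano reads its configuration from the `THEANO_FLAGS` environment variable at import time, so such tuning is typically applied before the first `import theano`. A sketch of the garbage-collection tip; the exact flag string is an assumption and device names vary by Theano version (`gpu` for the old backend, `cuda` for the newer one).

```python
import os

# Must be set BEFORE `import theano`: Theano parses THEANO_FLAGS on import.
# `allow_gc=False` keeps intermediate GPU buffers allocated between calls,
# trading memory for speed on cards like the GTX 680 (assumed flag combo).
os.environ["THEANO_FLAGS"] = "device=gpu,floatX=float32,allow_gc=False"

# import theano  # would pick up the flags above
```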
Incorporates SExtractor parameters as extra inputs, enhancing model robustness beyond raw image data, as shown in the parameter extraction scripts.
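Feeding extracted parameters alongside image data amounts to concatenating a small per-image feature vector onto the learned representation. A hypothetical sketch (the function name and standardization step are assumptions, not the repo's code):

```python
import numpy as np

def append_extra_features(learned_features, sextractor_params):
    """Concatenate per-image SExtractor parameters onto learned features.

    learned_features: (n_images, n_features) array from the network.
    sextractor_params: (n_images, n_params) array of measured quantities
    (e.g. ellipticity, Petrosian radius), standardized before joining.
    """
    mu = sextractor_params.mean(axis=0)
    sigma = sextractor_params.std(axis=0) + 1e-8  # avoid division by zero
    normed = (sextractor_params - mu) / sigma
    return np.concatenate([learned_features, normed], axis=1)

combined = append_extra_features(np.zeros((5, 32)), np.random.rand(5, 3))
```

In the repo these extra inputs go into the dense layers, letting the classifier use measurements that are hard for convolutions to recover from pixels alone.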
Relies on Theano and pylearn2, which are deprecated and lack modern support, making updates and compatibility with new GPUs challenging.
Requires many manual steps (dependency installation, copying data to RAM, running numerous scripts in sequence) with no end-to-end automation, increasing setup time and the risk of errors.
Acknowledges randomness in weight initialization and training, so retrained models produce slightly different scores; exact reproduction of the winning result is not guaranteed, as noted in the submission instructions.
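A common partial mitigation for such run-to-run variation, not something the repo itself provides, is pinning the random seeds before training. A sketch, assuming NumPy-driven initialization; GPU-side nondeterminism in frameworks like Theano can still cause small differences.

```python
import random

import numpy as np

def seed_everything(seed=0):
    """Fix Python and NumPy RNG seeds (hypothetical helper).

    Makes CPU-side randomness (weight init, shuffling) repeatable;
    GPU kernel nondeterminism may still perturb final scores.
    """
    random.seed(seed)
    np.random.seed(seed)

seed_everything(42)
a = np.random.rand(3)
seed_everything(42)
b = np.random.rand(3)
# a and b are identical draws
```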