Open-Awesome



Kaggle Galaxy Challenge

BSD-3-Clause · Python

Winning solution for the Galaxy Challenge on Kaggle, using convolutional neural networks to classify galaxy morphologies.

GitHub
500 stars · 184 forks · 0 contributors

What is Kaggle Galaxy Challenge?

kaggle-galaxies is the winning solution for the Galaxy Challenge on Kaggle, a competition focused on classifying galaxy morphologies from astronomical images. It implements an ensemble of convolutional neural networks trained with extensive data augmentation techniques to achieve high accuracy in galaxy classification. The solution addresses the challenge of automated galaxy morphology classification using deep learning approaches.

Target Audience

Data scientists and machine learning practitioners interested in astronomical image classification, Kaggle competition participants studying winning solutions, and researchers working on computer vision applications in astronomy.

Value Proposition

Provides a complete implementation of a competition-winning deep learning pipeline built specifically for galaxy morphology classification. The ensemble approach with data augmentation demonstrates techniques, state of the art when the project was created in 2014, for improving model robustness and accuracy in image classification tasks.

Overview

Winning solution for the Galaxy Challenge on Kaggle (http://www.kaggle.com/c/galaxy-zoo-the-galaxy-challenge)

Use Cases

Best For

  • Studying winning Kaggle competition solutions for image classification
  • Implementing convolutional neural networks for astronomical image analysis
  • Learning about data augmentation techniques for improving model generalization
  • Understanding ensemble methods for boosting prediction accuracy
  • Exploring Theano-based deep learning implementations
  • Reproducing research-grade galaxy morphology classification pipelines

Not Ideal For

  • Projects requiring modern deep learning frameworks like TensorFlow or PyTorch
  • Teams with limited GPU resources or needing quick, automated deployment
  • Applications demanding real-time inference or easy reproducibility without manual setup

Pros & Cons

Pros

Ensemble Accuracy Boost

Blends predictions from multiple CNN models to achieve competition-winning scores, as demonstrated by the #1 Kaggle ranking without averaging.
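As a rough illustration of what blending means in general, the sketch below averages the prediction arrays of several models, optionally with per-model weights. It is not taken from the repository: the function name `blend_predictions`, the `weights` parameter, and the array shapes are assumptions made for this example.

```python
import numpy as np


def blend_predictions(predictions, weights=None):
    """Blend per-model predictions, each of shape (n_samples, n_outputs).

    Hypothetical helper for illustration only; not the repository's code.
    """
    # Stack into shape (n_models, n_samples, n_outputs).
    stacked = np.stack(predictions, axis=0)
    if weights is None:
        # Uniform blend: plain mean over the model axis.
        return stacked.mean(axis=0)
    # Weighted blend: normalize the weights so they sum to 1.
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    # Contract the model axis of `stacked` against the weight vector.
    return np.tensordot(weights, stacked, axes=1)


model_a = np.array([[0.2, 0.8]])
model_b = np.array([[0.4, 0.6]])
uniform = blend_predictions([model_a, model_b])
weighted = blend_predictions([model_a, model_b], weights=[3, 1])
```

Uniform averaging is the simplest choice; weights would normally be tuned on held-out data.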

Extensive Data Augmentation

Uses rotations and flips to increase training diversity, improving generalization without additional data, as outlined in the augmentation scripts.
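A minimal sketch of rotation-and-flip augmentation, assuming images are NumPy arrays of shape (H, W) or (H, W, C). It only produces the eight variants made of 90-degree rotations and a horizontal flip; the helper name `dihedral_variants` is hypothetical and the repository's own augmentation scripts may work differently.

```python
import numpy as np


def dihedral_variants(image):
    """Return the 8 rotated/flipped copies of an (H, W) or (H, W, C) image.

    Hypothetical helper for illustration only; not the repository's code.
    """
    variants = []
    # Start from the original image and from its left-right mirror.
    for base in (image, np.fliplr(image)):
        # Rotate each by 0, 90, 180 and 270 degrees in the image plane.
        for k in range(4):
            variants.append(np.rot90(base, k=k, axes=(0, 1)))
    return variants


image = np.arange(6).reshape(2, 3)
variants = dihedral_variants(image)
```

Galaxy images have no preferred orientation, so each variant can reasonably keep the label of the original.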

GPU-Optimized Performance

Uses Theano for GPU acceleration and documents specific tuning tips, such as disabling garbage collection, for hardware like the GeForce GTX 680.

Integrated Feature Extraction

Incorporates SExtractor parameters as extra inputs, enhancing model robustness beyond raw image data, as shown in the parameter extraction scripts.

Cons

Outdated Framework Dependencies

Relies on Theano and pylearn2, which are deprecated and lack modern support, making updates and compatibility with new GPUs challenging.

Complex Manual Setup

Requires multiple steps including dependency installation, data copying to RAM, and running numerous scripts without automation, increasing setup time and error risk.

Non-Deterministic Results

Weight initialization and training are randomized, so scores vary between runs and exact reproduction is not guaranteed, as noted in the submission instructions.
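One common way to reduce this kind of run-to-run variation is to seed the random number generator used for weight initialization. The sketch below shows the idea with NumPy; the function `init_weights` and its `seed` and `scale` parameters are assumptions for this example, and the repository is not known to expose such an option. Seeding the initializer alone may still leave other sources of variation, such as GPU operations.

```python
import numpy as np


def init_weights(shape, seed=None, scale=0.01):
    """Draw small Gaussian initial weights from a seeded generator.

    Hypothetical helper for illustration only; not the repository's code.
    """
    # A fixed seed makes the draw repeatable; seed=None uses fresh entropy.
    rng = np.random.default_rng(seed)
    return rng.normal(loc=0.0, scale=scale, size=shape)


first = init_weights((3, 3), seed=42)
second = init_weights((3, 3), seed=42)
other = init_weights((3, 3), seed=7)
```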

Quick Stats

Stars: 500
Forks: 184
Contributors: 0
Open Issues: 0
Last commit: 12 years ago
Created: 2014

Tags

#kaggle-competition #astronomy #data-augmentation #theano #image-classification #computer-vision #convolutional-neural-networks #machine-learning

Built With

  • Theano
  • scikit-image
  • pandas
  • Python
  • NumPy
  • SciPy

Included in

Machine Learning · 72.2k

Related Projects

open-solution-home-credit

Open solution to the Home Credit Default Risk challenge

Stars: 464
Forks: 171
Last commit: 4 years ago

open-solution-data-science-bowl-2018

Open solution to the Data Science Bowl 2018

Stars: 155
Forks: 42
Last commit: 4 years ago

open-solution-toxic-comments

Open solution to the Toxic Comment Classification Challenge

Stars: 155
Forks: 55
Last commit: 4 years ago

open-solution-salt-identification

Open solution to the TGS Salt Identification Challenge

Stars: 121
Forks: 44
Last commit: 5 years ago