Open-Awesome
© 2026 Open-Awesome. Curated for the developer elite.

scikit-learn

Framework
123 projects · 403.4k total stars · 103.1k total forks · 7 languages

Open-source projects built with scikit-learn

There are currently 123 open-source projects built with scikit-learn, with a combined total of 403.4k GitHub stars. The most common language among these projects is Python.

Showing 122 open-source projects · page 2 of 4

hyperopt-sklearn
hyperopt/hyperopt-sklearn

Hyperopt-sklearn automates hyperparameter optimization and model selection for scikit-learn machine learning pipelines.

1.6k stars · 274 forks · Python
1 year ago
boruta_py
scikit-learn-contrib/boruta_py

Python implementation of the Boruta all-relevant feature selection method with scikit-learn compatibility.

1.6k stars · 266 forks · Python
6 months ago
imodels
csinva/imodels

A Python package for concise, transparent, and accurate predictive modeling with sklearn-compatible interpretable models.

1.6k stars · 137 forks · Jupyter Notebook
13 days ago
scikit-feature
jundongl/scikit-feature

An open-source Python repository providing around 40 feature selection algorithms for machine learning applications.

1.6k stars · 441 forks · Python
1 year ago
MLeap
combust/mleap

MLeap is a portable execution engine for deploying machine learning pipelines from Spark and Scikit-learn without their runtime dependencies.

1.5k stars · 317 forks · Scala
3 months ago
Intel(R) Extension for Scikit-learn
intel/scikit-learn-intelex

A free software AI accelerator that speeds up scikit-learn applications by 10-100x on CPUs and GPUs with no code changes.

1.3k stars · 187 forks · Python
22 hours ago
Hyperparameter-Optimization-of-Machine-Learning-Algorithms
LiYangHart/Hyperparameter-Optimization-of-Machine-Learning-Algorithms

Implementation of hyperparameter optimization methods for ML/DL models with sample code for regression and classification tasks.

1.3k stars · 304 forks · Jupyter Notebook
3 years ago
sklearn-porter
nok/sklearn-porter

Transpile trained scikit-learn estimators to C, Java, JavaScript, Go, PHP, and Ruby for embedded systems and performance-critical applications.

1.3k stars · 169 forks · Python
2 years ago
Xcessiv
reiinakano/xcessiv

A web-based tool for automated hyperparameter tuning and stacked ensemble creation in Python.

1.3k stars · 106 forks · Python
8 years ago
mlforecast
Nixtla/mlforecast

A Python framework for scalable time series forecasting using machine learning models, designed for production environments.

1.2k stars · 125 forks · Python
3 days ago
Ember
elastic/ember

An open dataset and toolkit for training static PE malware machine learning models, featuring millions of labeled Windows executable samples.

1.2k stars · 311 forks · Jupyter Notebook
1 year ago
Ember
endgameinc/ember

An open dataset and toolkit for training static PE malware machine learning models, featuring extracted features from millions of Windows executable files.

1.2k stars · 311 forks · Jupyter Notebook
1 year ago
fastFM
ibayer/fastFM

A Python library implementing Factorization Machines with a scikit-learn compatible API for regression, classification, and ranking tasks.

1.1k stars · 205 forks · Python
3 years ago
Predict time series
guillaume-chevalier/seq2seq-signal-prediction

A TensorFlow-based educational project for learning seq2seq RNNs through signal forecasting exercises.

1.1k stars · 289 forks · Jupyter Notebook
3 years ago
scikit-multilearn
scikit-multilearn/scikit-multilearn

A scikit-learn compatible Python module for multi-label classification tasks.

953 stars · 175 forks · Python
2 years ago
pyFM
coreylynch/pyFM

A Python implementation of Factorization Machines for recommendation and classification tasks using stochastic gradient descent with adaptive regularization.

925 stars · 304 forks · Python
5 years ago
A chess AI that learns to play chess using deep learning.
erikbern/deep-pink

A chess AI that learns to play chess using deep learning and neural networks.

831 stars · 158 forks · Python
9 years ago
zheye
muchrooms/zheye

A convolutional neural network program that identifies inverted Chinese character captchas used by Zhihu for login verification.

794 stars · 221 forks · Python
2 years ago
data_hacking
ClickSecurity/data_hacking

A collection of IPython notebooks demonstrating data analysis and machine learning techniques on security datasets.

783 stars · 300 forks · Jupyter Notebook
7 years ago
tffm
geffy/tffm

TensorFlow implementation of arbitrary order (≥2) Factorization Machines for classification and regression tasks.

778 stars · 173 forks · Jupyter Notebook
4 years ago
MIDAS: Detecting Microcluster Anomalies in Edge Streams
bhatiasiddharth/MIDAS

A real-time anomaly detection algorithm for dynamic graph streams, identifying intrusions, fraud, and fake ratings with constant memory and update time.

775 stars · 98 forks · C++
2 years ago
sklearn-deap
rsteca/sklearn-deap

A scikit-learn compatible hyperparameter optimization tool using evolutionary algorithms instead of grid search.

774 stars · 129 forks · Jupyter Notebook
2 years ago
CHIEF
hms-dbmi/CHIEF

A general-purpose foundation model for cancer diagnosis and prognosis prediction from histopathology whole-slide images.

712 stars · 115 forks · Python
5 months ago
chainer-chemistry
pfnet-research/chainer-chemistry

A deep learning library built on Chainer for molecular property prediction using graph convolutional neural networks.

700 stars · 132 forks · Python
3 years ago
Reproducible Experiment Platform (REP)
yandex/rep

IPython-based environment for reproducible machine learning research with unified wrappers for multiple ML libraries.

700 stars · 148 forks · Jupyter Notebook
1 year ago
vecstack
vecxoz/vecstack

A Python package for stacking (stacked generalization) with both functional and scikit-learn compatible APIs.

699 stars · 81 forks · Python
7 months ago
profanity-check
vzhou842/profanity-check

A fast, robust Python library to detect offensive language in text using a machine learning model.

653 stars · 123 forks · Python
1 year ago
Gym-Malware
endgameinc/gym-malware

A reinforcement learning environment for training AI agents to manipulate malware samples and evade static machine learning detection.

636 stars · 166 forks · Python
3 years ago
AutoML-Implementation-for-Static-and-Dynamic-Data-Analytics
Western-OC2-Lab/AutoML-Implementation-for-Static-and-Dynamic-Data-Analytics

An AutoML implementation and tutorial for automating machine learning pipelines on both static datasets and dynamic data streams, with a focus on IoT anomaly detection.

627 stars · 110 forks · Jupyter Notebook
2 years ago
Data science your way
jadianes/data-science-your-way

A tutorial series comparing how to implement data science concepts and build applications in both Python and R ecosystems.

616 stars · 253 forks · Jupyter Notebook
5 years ago
MusicGenreClassification
mlachmish/MusicGenreClassification

Classify music genre from a 10-second audio stream using a convolutional neural network trained on mel-frequency spectrograms.

600 stars · 121 forks · Python
6 years ago
ipython-notebooks
ogrisel/notebooks

A collection of IPython notebooks containing machine learning experiments and examples using scikit-learn and related Python libraries.

575 stars · 198 forks · Jupyter Notebook
1 month ago
TabGAN
Diyago/Tabular-data-generation

A Python library for generating high-quality synthetic tabular data using GANs, diffusion models, and large language models.

570 stars · 83 forks · Python
2 months ago
SOMPY
sevamoo/SOMPY

A Python library implementing Self-Organizing Maps (SOM) with batch training, PCA initialization, and visualization tools.

552 stars · 248 forks · Jupyter Notebook
3 years ago
Auto_ViML
AutoViML/Auto_ViML

Automatically builds high-performance interpretable machine learning models with minimal features using a single line of code.

547 stars · 104 forks · Python
1 year ago
SUAVE
suavecode/SUAVE

A multi-fidelity conceptual design environment for modeling future aircraft with advanced technologies.

515 stars · 450 forks · ReScript
2 years ago