Open-Awesome
© 2026 Open-Awesome. Curated for the developer elite.

Scikit Learn

86 projects

Showing 36 of 86 projects

Auto ML
Language: Python

Automated machine learning library for production and analytics, handling feature engineering, model selection, and hyperparameter optimization.

Tags: #hyperparameter-optimization #machine-learning-library #data-science
Stars: 1.7k
Forks: 309
Last commit
hyperopt-sklearn
Language: Python

Hyperopt-sklearn automates hyperparameter optimization and model selection for scikit-learn machine learning pipelines.

Tags: #hyperparameter-optimization #data-science #bayesian-optimization
Stars: 1.6k
Forks: 274
Last commit
boruta_py
Language: Python

Python implementation of the Boruta all-relevant feature selection method with scikit-learn compatibility.

Tags: #statistical-analysis #random-forest #ensemble-methods
Stars: 1.6k
Forks: 266
Last commit: 6 months ago
imodels
Language: Jupyter Notebook

A Python package for concise, transparent, and accurate predictive modeling with sklearn-compatible interpretable models.

Tags: #ai #rule-based-models #data-science
Stars: 1.6k
Forks: 137
Last commit: 11 days ago
scikit-feature
Language: Python

An open-source Python repository providing around 40 feature selection algorithms for machine learning applications.

Tags: #feature-selection #scipy #data-science
Stars: 1.6k
Forks: 441
Last commit: 1 year ago
MLeap
Language: Scala

MLeap is a portable execution engine for deploying machine learning pipelines from Spark and Scikit-learn without their runtime dependencies.

Tags: #apache-spark #spark #production-ml
Stars: 1.5k
Forks: 317
Last commit: 2 months ago
skforecast
Language: Python

A Python library for time series forecasting using scikit-learn compatible machine learning models.

Tags: #data-science #time-series-forecasting #catboost
Stars: 1.5k
Forks: 190
Last commit: 1 day ago
skforecast
Language: Python

A Python library for time series forecasting using scikit-learn compatible machine learning models.

Tags: #data-science #lightgbm #python
Stars: 1.5k
Forks: 190
Last commit: 1 day ago
Intel(R) Extension for Scikit-learn
Language: Python

A free software AI accelerator that speeds up scikit-learn applications by 10-100x on CPUs and GPUs with no code changes.

Tags: #oneapi #ai-machine-learning #ai-accelerator
Stars: 1.3k
Forks: 187
Last commit
Hyperparameter-Optimization-of-Machine-Learning-Algorithms
Language: Jupyter Notebook

Implementation of hyperparameter optimization methods for ML/DL models with sample code for regression and classification tasks.

Tags: #random-search #hyperparameter-optimization #hyperparameter-tuning
Stars: 1.3k
Forks: 304
Last commit
sklearn-porter
Language: Python

Transpile trained scikit-learn estimators to C, Java, JavaScript, Go, PHP, and Ruby for embedded systems and performance-critical applications.

Tags: #deployment #embedded-systems #sklearn
Stars: 1.3k
Forks: 169
Last commit: 2 years ago
Xcessiv
Language: Python

A web-based tool for automated hyperparameter tuning and stacked ensemble creation in Python.

Tags: #ensemble-learning #hyperparameter-optimization #hyperparameter-tuning
Stars: 1.3k
Forks: 106
Last commit: 8 years ago
fastFM
Language: Python

A Python library implementing Factorization Machines with a scikit-learn compatible API for regression, classification, and ranking tasks.

Tags: #recommender-system #python-library #sparse-data
Stars: 1.1k
Forks: 205
Last commit: 3 years ago
scikit-multilearn
Language: Python

A scikit-learn compatible Python module for multi-label classification tasks.

Tags: #scikit #scipy #data-science
Stars: 953
Forks: 175
Last commit: 2 years ago
mlens
Language: Python

A Python library for building high-performance, memory-efficient ensemble learning networks with a Scikit-learn compatible API.

Tags: #ensemble-learning #parallel-computing #ensemble
Stars: 866
Forks: 110
Last commit: 2 years ago
scikit-multiflow
Language: Python

A Python machine learning package for incremental learning on streaming data with concept drift detection.

Tags: #scikit #streaming-data #adaptive-learning
Stars: 795
Forks: 188
Last commit: 2 years ago
data_hacking
Language: Jupyter Notebook

A collection of IPython notebooks demonstrating data analysis and machine learning techniques on security datasets.

Tags: #security-analytics #educational #python
Stars: 783
Forks: 300
Last commit: 7 years ago
sklearn-deap
Language: Jupyter Notebook

A scikit-learn compatible hyperparameter optimization tool using evolutionary algorithms instead of grid search.

Tags: #genetic-algorithms #hyperparameter-optimization #evolutionary-algorithms
Stars: 774
Forks: 129
Last commit
ThunderGBM
Language: C++

A fast GPU-accelerated library for training Gradient Boosting Decision Trees (GBDT) and Random Forests.

Tags: #cuda #random-forest #high-performance-computing
Stars: 713
Forks: 87
Last commit: 1 year ago
Reproducible Experiment Platform (REP)
Language: Jupyter Notebook

IPython-based environment for reproducible machine learning research with unified wrappers for multiple ML libraries.

Tags: #parallel-computing #data-science #experiment-tracking
Stars: 700
Forks: 148
Last commit
vecstack
Language: Python

A Python package for stacking (stacked generalization) with both functional and scikit-learn compatible APIs.

Tags: #ensemble-learning #blending #stacked-generalization
Stars: 699
Forks: 81
Last commit: 7 months ago
profanity-check
Language: Python

A fast, robust Python library to detect offensive language in text using a machine learning model.

Tags: #profanity-filter #sklearn #python3
Stars: 653
Forks: 123
Last commit: 1 year ago
Dora
Language: Python

A Python library that automates the tedious parts of exploratory data analysis with cleaning, feature engineering, visualization, and versioning.

Tags: #data-cleaning #data-versioning #python
Stars: 649
Forks: 75
Last commit: 10 months ago
Intel® oneAPI Data Analytics Library
Language: C++

A high-performance C++/DPC++ library for accelerated machine learning on CPUs, GPUs, and distributed systems.

Tags: #oneapi #hacktoberfest #ai-machine-learning
Stars: 648
Forks: 225
Last commit: 1 day ago
Data science your way
Language: Jupyter Notebook

A tutorial series comparing how to implement data science concepts and build applications in both Python and R ecosystems.

Tags: #notebook #educational #data-science
Stars: 616
Forks: 253
Last commit: 5 years ago
ipython-notebooks
Language: Jupyter Notebook

A collection of IPython notebooks containing machine learning experiments and examples using scikit-learn and related Python libraries.

Tags: #data-science #jupyter #python
Stars: 575
Forks: 198
Last commit: 1 month ago
SOMPY
Language: Jupyter Notebook

A Python library implementing Self-Organizing Maps (SOM) with batch training, PCA initialization, and visualization tools.

Tags: #self-organizing-maps #python-library #matplotlib
Stars: 552
Forks: 248
Last commit: 3 years ago
Auto_ViML
Language: Python

Automatically builds high-performance interpretable machine learning models with minimal features using a single line of code.

Tags: #data-cleaning #imbalanced-data #feature-selection
Stars: 547
Forks: 104
Last commit: 1 year ago
Understanding random forests: from theory to practice
Language: TeX

A comprehensive PhD dissertation providing an in-depth theoretical and practical analysis of random forests, from algorithmic foundations to interpretability.

Tags: #ensemble-methods #random-forests #algorithm-analysis
Stars: 529
Forks: 154
Last commit
sklearn-bayes
Language: Jupyter Notebook

A Python package providing Bayesian machine learning algorithms with a scikit-learn compatible API.

Tags: #probabilistic-modeling #variational-inference #mixture-models
Stars: 524
Forks: 118
Last commit: 4 years ago
sklearn-expertsys
Language: Python

A scikit-learn compatible classifier that produces human-interpretable decision rules instead of black box models.

Tags: #model-transparency #classification #bayesian-rule-lists
Stars: 490
Forks: 71
Last commit: 8 years ago
CellTypist
Language: Python

An automated cell type annotation tool for single-cell RNA-seq data using logistic regression classifiers.

Tags: #scrna-seq #biomedical-data #label-transfer
Stars: 487
Forks: 59
Last commit: 15 days ago
leaves
Language: Go

A pure Go library for making predictions with Gradient Boosting Regression Trees models from LightGBM, XGBoost, and scikit-learn.

Tags: #gbdt #data-science #go-library
Stars: 479
Forks: 87
Last commit: 1 year ago
RuleFit
Language: Python

Python implementation of the RuleFit algorithm for interpretable machine learning predictions using rule ensembles.

Tags: #ensemble-methods #rule-based-models #python
Stars: 446
Forks: 120
Last commit: 2 years ago
imbalanced-ensemble
Language: Python

A Python library for class-imbalanced ensemble learning with 30+ algorithms, built on scikit-learn.

Tags: #ensemble-learning #imbalanced-data #imbalanced-learning
Stars: 426
Forks: 60
Last commit: 3 months ago
scikit-rebate
Language: Python

A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for machine learning.

Tags: #relief-algorithms #feature-selection #data-science
Stars: 421
Forks: 72
Last commit: 3 years ago
Page 2 of 3

Related Tags

#Machine Learning (83)
#Python (67)
#Data Science (48)
#Python Library (18)
#Deep Learning (12)
#Xgboost (12)
#Automl (11)
#Hyperparameter Optimization (11)
#Pandas (10)
#Jupyter Notebooks (9)
#Automated Machine Learning (9)
#Bayesian Optimization (8)