Open-Awesome

© 2026 Open-Awesome. Curated for the developer elite.


Data Science

506 projects

Showing 36 of 506 projects

machine learning
Language: JavaScript

A web interface and REST API for classification and regression using Support Vector Machine (SVM) and Support Vector Regression (SVR) algorithms.

#flask #data-science #rest-api
Stars: 258
Forks: 83
Last commit: 5 years ago
PyODDS
Language: Python

An end-to-end Python outlier detection system with database support, automated machine learning, and unified APIs for statistical, ML, and deep learning models.

#statistical-analysis #database #python-library
Stars: 255
Forks: 39
Last commit: 3 years ago
ferry
Language: Python

Define, run, and deploy big data applications on AWS, OpenStack, and local machines using Docker.

#devops #spark #data-science
Stars: 254
Forks: 25
Last commit: 11 years ago
elastic
Language: R

R client for the Elasticsearch HTTP API, enabling data indexing, search, and analysis from R.

#data-indexing #database #r-package
Stars: 245
Forks: 59
Last commit: 6 months ago
R Type Provider
Language: F#

An F# type provider that enables seamless interoperability with R packages, offering type-safe access to R functions from .NET.

#type-provider #type-providers #data-science
Stars: 244
Forks: 69
Last commit: 24 days ago
GWU: Data Mining (Decision Sciences 6279)
Language: Jupyter Notebook

Course materials for GWU's Data Mining and Machine Learning classes covering preprocessing, modeling, and practical Kaggle applications.

#data-science #kaggle #educational-materials
Stars: 241
Forks: 175
Last commit: 1 year ago
RNeo4j
Language: R

An R driver for Neo4j that enables reading and writing graph data directly from the R environment.

#neo4j-driver #r-package #data-science
Stars: 238
Forks: 64
Last commit: 7 years ago
scicloj.ml
Language: Clojure

An idiomatic Clojure machine learning library providing a unified interface for classification, regression, and unsupervised models.

#metamorph #tech-ml-dataset #hyperparameter-optimization
Stars: 238
Forks: 16
Last commit: 7 months ago
RNeo4j
Language: R

An R driver for Neo4j that enables reading and writing graph data directly from R.

#igraph #driver #data-science
Stars: 238
Forks: 64
Last commit: 7 years ago
BoostARoota
Language: Python

A fast feature selection algorithm for tree-based models like XGBoost, designed to outperform Boruta in speed and performance.

#algorithm #datascientist #feature-selection
Stars: 233
Forks: 36
Last commit: 5 years ago
Topic Models Resources
Language: R

A curated collection of learning resources, R packages, and practical examples for understanding and applying topic modeling techniques.

#document-analysis #text-analysis #data-science
Stars: 232
Forks: 54
Last commit: 10 years ago
AtomAI
Language: Python

A PyTorch-based Python package for deep and machine learning analysis of microscopy data, designed for domain scientists.

#ensemble-learning #scientific-computing #image-analysis
Stars: 229
Forks: 42
Last commit: 11 months ago
git2r
Language: R

R bindings to the libgit2 library, providing programmatic access to Git repositories from R.

#version-control #r-package #data-science
Stars: 223
Forks: 61
Last commit: 3 months ago
theme-darcula
Language: CSS

A Darcula theme for JupyterLab, modeled after the classic IntelliJ theme, with dark scrollbar support.

#jupyterlab-extension #jupyterlab-2 #developer-tools
Stars: 220
Forks: 37
Last commit: 3 years ago
reshape2
Language: R

An R package for flexibly rearranging, reshaping, and aggregating data, now superseded by tidyr.

#r-package #data-science #r-programming
Stars: 214
Forks: 56
Last commit: 6 months ago
FSharp.Charting
Language: F#

A charting library designed for interactive data visualization in F# scripting environments.

#functional-programming #data-science #fsi
Stars: 211
Forks: 66
Last commit: 6 years ago
animation
Language: R

An R package for creating and exporting statistical animations to HTML, GIF, video, and PDF formats.

#gif-generation #animation #r-package
Stars: 209
Forks: 59
Last commit: 9 months ago
V8
Language: C++

An R interface to Google's V8 JavaScript and WebAssembly engine for executing JavaScript code within R.

#webassembly #javascript-engine #r-package
Stars: 207
Forks: 29
Last commit: 27 days ago
spatstat
Language: R

A comprehensive family of R packages for analyzing spatial point pattern data and other spatial data types.

#statistical-models #epidemiology #spatstat
Stars: 207
Forks: 43
Last commit: 7 days ago
Rosetta
Language: Jupyter Notebook

A Python toolkit for text-focused data science on medium-sized datasets, bridging memory and cluster-scale processing.

#stream-processing #multiprocessing #scientific-computing
Stars: 207
Forks: 45
Last commit: 3 years ago
Dataset
Language: Python

A Python library for building lazy data processing and machine learning workflows that handle datasets larger than memory.

#pipeline-framework #batch-processing #workflow
Stars: 206
Forks: 45
Last commit: 20 days ago
visualize_ML
Language: Python

A Python package for automated univariate and bivariate data analysis and visualization to streamline machine learning workflows.

#statistical-analysis #statisics #feature-selection
Stars: 205
Forks: 29
Last commit: 9 years ago
HASS-data-detective
Language: Python

A Python package for exploring and analyzing data from your Home Assistant database.

#iot #home-automation #data-science
Stars: 204
Forks: 39
Last commit: 2 months ago
ContainDS Dashboards
Language: Python

A JupyterHub extension for publishing notebooks and apps as secure, interactive dashboards for non-technical audiences.

#dashboard-publishing #notebook-sharing #jupyterhub
Stars: 200
Forks: 37
Last commit: 1 year ago
Open Data
Language: R

Archived R package for accessing open data from various government and scientific sources.

#r-package #data-science #scientific-data
Stars: 198
Forks: 46
Last commit: 4 years ago
DVClive
Language: Python

A Python library for logging ML metrics, parameters, and models in simple file formats, compatible with DVC and Git.

#hacktoberfest #developer-tools #python-library
Stars: 192
Forks: 40
Last commit: 4 days ago
jut
Language: Python

A command-line tool to view Jupyter notebooks directly in the terminal with customizable display options.

#python-rich #developer-tools #notebook-viewer
Stars: 192
Forks: 3
Last commit: 3 years ago
Panthera
Language: Clojure

A Clojure library providing data-frames and arrays through Python's pandas and numpy.

#array #data-science #dataframe
Stars: 191
Forks: 15
Last commit: 6 years ago
TIQ-test
Language: R

A tool for data visualization and statistical analysis of threat intelligence indicator feeds to measure their quality and effectiveness.

#statistical-analysis #security-analytics #data-science
Stars: 178
Forks: 44
Last commit: 10 years ago
ChemML
Language: Python

A Python machine learning and informatics suite for analyzing, mining, and modeling chemical and materials data.

#data-science #deep-learning #materials-science
Stars: 176
Forks: 33
Last commit: 1 month ago
LiFT
Language: Scala

A Scala/Spark library for measuring fairness and mitigating bias in large-scale machine learning workflows.

#fairness-ml #apache-spark #spark
Stars: 175
Forks: 21
Last commit: 5 months ago
CometML
Language: Jupyter Notebook

A collection of examples demonstrating how to use Comet.ml for machine learning experiment tracking across various Python frameworks.

#deep-learning-libraries #data-science #deep-learning
Stars: 174
Forks: 65
Last commit: 19 days ago
HFT_Bitcoin
Language: Jupyter Notebook

Analysis of High Frequency Trading patterns and strategies on Bitcoin exchanges using Jupyter notebooks.

#market-microstructure #high-frequency-trading #bitcoin-analysis
Stars: 172
Forks: 46
Last commit: 8 years ago
Awesome Credit Modeling

A curated collection of academic papers, articles, and resources on credit scoring and credit risk modeling techniques.

#data-science #research-papers #statistical-classification
Stars: 171
Forks: 28
Last commit: 2 years ago
bayesloop
Language: Python

A Python probabilistic programming framework for objective model selection in time-varying parameter time series models.

#grid-based-inference #scientific-computing #sequential-inference
Stars: 169
Forks: 30
Last commit: 1 month ago
Big Data For Chimps
Language: Ruby

A practical guide to exploratory data analytics using Hadoop with Pig and Ruby for terabyte-scale data processing.

#exploratory-analysis #data-science #terabyte-processing
Stars: 169
Forks: 63
Last commit: 11 years ago
Page 12 of 15

Related Tags

#Machine Learning (288)
#Python (245)
#Deep Learning (84)
#Data Analysis (79)
#Data Visualization (79)
#Statistics (61)
#Python Library (55)
#Jupyter Notebook (53)
#R (52)
#Jupyter (49)
#Scikit Learn (48)
#Pandas (43)