Data Science

624 projects

Showing 36 of 624 projects

A Python package for automated univariate and bivariate data analysis and visualization to streamline machine learning workflows.

#statistical-analysis #statisics #feature-selection

Stars: 205

Forks: 29

Last commit: 9 years ago

HASS-data-detective (Python)

A Python package for exploring and analyzing data from your Home Assistant database.

#iot #home-automation #data-science

Stars: 203

Forks: 39

Last commit: 3 months ago

ContainDS Dashboards (Python)

A JupyterHub extension for publishing notebooks and apps as secure, interactive dashboards for non-technical audiences.

#dashboard-publishing #notebook-sharing #jupyterhub

Stars: 199

Forks: 37

Last commit: 1 year ago

Open Data (R)

Archived R package for accessing open data from various government and scientific sources.

#r-package #data-science #scientific-data

Stars: 198

Forks: 46

Last commit: 4 years ago

DVClive (Python)

A Python library for logging ML metrics, parameters, and models in simple file formats, compatible with DVC and Git.

#hacktoberfest #developer-tools #python-library

Stars: 196

Forks: 40

Last commit: 4 days ago

jut (Python)

A command-line tool to view Jupyter notebooks directly in the terminal with customizable display options.

#python-rich #developer-tools #notebook-viewer

Stars: 192

Forks: 3

Last commit: 4 years ago

Panthera (Clojure)

A Clojure library providing data-frames and arrays through Python's pandas and numpy.

#array #data-science #dataframe

Stars: 191

Forks: 15

Last commit: 6 years ago

TIQ-test (R)

A tool for data visualization and statistical analysis of threat intelligence indicator feeds to measure their quality and effectiveness.

#statistical-analysis #security-analytics #data-science

Stars: 179

Forks: 44

Last commit: 10 years ago

ChemML (Python)

A Python machine learning and informatics suite for analyzing, mining, and modeling chemical and materials data.

#data-science #deep-learning #materials-science

Stars: 178

Forks: 34

Last commit: 11 days ago

CometML (Jupyter Notebook)

A collection of examples demonstrating how to use Comet.ml for machine learning experiment tracking across various Python frameworks.

#deep-learning-libraries #data-science #deep-learning

Stars: 176

Forks: 67

Last commit: 25 days ago

HFT_Bitcoin (Jupyter Notebook)

Analysis of High Frequency Trading patterns and strategies on Bitcoin exchanges using Jupyter notebooks.

#market-microstructure #high-frequency-trading #bitcoin-analysis

Stars: 174

Forks: 46

Last commit: 9 years ago

LiFT (Scala)

A Scala/Spark library for measuring fairness and mitigating bias in large-scale machine learning workflows.

#fairness-ml #apache-spark #spark

Stars: 173

Forks: 22

Last commit: 7 months ago

Awesome Credit Modeling

A curated collection of academic papers, articles, and resources on credit scoring and credit risk modeling techniques.

#data-science #research-papers #statistical-classification

Stars: 172

Forks: 28

Last commit: 2 years ago

bayesloop (Python)

A Python probabilistic programming framework for objective model selection in time-varying parameter time series models.

#grid-based-inference #scientific-computing #sequential-inference

Stars: 171

Forks: 31

Last commit: 1 month ago

Big Data For Chimps (Ruby)

A practical guide to exploratory data analytics using Hadoop with Pig and Ruby for terabyte-scale data processing.

#exploratory-analysis #data-science #terabyte-processing

Julia package providing easy access to 700+ standard R datasets for data analysis and statistical learning.

#julia #data-science #statistics

Stars: 167

Forks: 54

Last commit: 1 month ago

Jupyter Notebook REST API (Jupyter Notebook)

Run Jupyter notebooks as REST API endpoints, enabling programmatic execution of notebook workflows.

#fastapi #uvicorn #data-science

Stars: 166

Forks: 11

Last commit: 3 years ago

checkpoint (R)

An R package that installs packages from MRAN snapshots to ensure reproducible environments by locking package versions to a specific date.

#mran #version-control #r-package

Stars: 165

Forks: 37

Last commit: 4 years ago

jRuby Mahout (Ruby)

A JRuby gem that provides Ruby-friendly access to Apache Mahout's scalable machine learning capabilities for recommendations.

#jruby #data-science #ruby-gem

Stars: 165

Forks: 14

Last commit: 11 years ago

ClojisR (Clojure)

A bridge library enabling Clojure to call R functions and use R objects for statistical computing and data science.

#data-science #rlang #r-language

Stars: 162

Forks: 11

Last commit: 1 month ago

OpenBioLink (Python)

A resource and evaluation framework for benchmarking link prediction models on large-scale, heterogeneous biomedical knowledge graphs.

#heterogeneous-graphs #knowledge-graphs #data-science

Stars: 161

Forks: 24

Last commit: 2 years ago

DistributedR (R)

A scalable high-performance platform for R that enables large-scale machine learning, statistical analysis, and graph processing across clusters.

#statistical-analysis #graph-processing #high-performance-computing

A Julia package for reproducible data setup, automating dataset downloads and management for scientific computing.

#scientific-computing #dataset-download #julia

Stars: 160

Forks: 45

Last commit: 22 days ago

A list of colleges and universities offering degrees in data science (Python)

A curated list of colleges and universities worldwide offering data science degrees.

#higher-education #universities #data-science

Stars: 159

Forks: 195

Last commit

open-solution-data-science-bowl-2018 (Python)

Open-source implementation of the winning solution for the 2018 Data Science Bowl Kaggle competition using PyTorch and U-Net.

#data-science #kaggle #deep-learning

A simple machine learning framework written in Swift, currently focusing on regression algorithms.

#genetic-algorithms #machine-learning-library #data-science

Stars: 154

Forks: 14

Last commit: 8 years ago

open-solution-toxic-comments (Python)

An open-source starter solution for the Kaggle Toxic Comment Classification Challenge, providing ready-to-use machine learning pipelines for detecting online harassment.

#ensemble-learning #text-classification #data-science

Stars: 154

Forks: 55

Last commit

Automatically Dockerize A Data-Science Repo As A Jupyter Server (Shell)

A GitHub Action to build and push Jupyter-enabled Docker images from data science repositories using repo2docker.

#actions #containerization #devops

An open platform for hosting and participating in data science challenges focused on open science and open data.

#data-science #open-science #challenge-platform

Stars: 152

Forks: 30

Last commit: 3 years ago

treebeard (TypeScript)

A GitHub Action that automatically tests Jupyter notebooks from top to bottom using nbmake and pytest.

#scientific-computing #pytest #notebook

Stars: 151

Forks: 8

Last commit: 4 years ago

nteract (TypeScript)

A desktop application for interactive computing with Jupyter notebooks, supporting multiple kernels and rich outputs.

#desktop-application #scientific-computing #interactive-computing

An R package providing 2,260 network datasets in igraph format from diverse sources like social networks, animal interactions, and movie co-stars.

#igraph #r-package #data-science

Stars: 146

Forks: 16

Last commit: 3 months ago

RJulia (C)

An R package that provides a bidirectional interface for calling Julia code from R and mapping objects between both languages.

#scientific-computing #julia #r-package

Stars: 145

Forks: 23

Last commit: 8 years ago

elusion (Rust)

A Rust DataFrame and data engineering library with PySpark/SQL-like syntax, built for business data pipelines with Microsoft stack integration.

#pyspark-alternative #sql-like #data-science

Stars: 143

Forks: 4

Last commit: 3 months ago

Stats (Julia)

A convenience meta-package that loads essential Julia packages for statistics with a single import.

#julia #meta-package #data-science

Stars: 143

Forks: 14

Last commit: 3 years ago

doddle-model (Scala)

An in-memory machine learning library for Scala with a scikit-learn-like API, built on Breeze for parallel and distributed systems.

#parallel-computing #in-memory #data-science

Stars: 139

Forks: 22

Last commit: 1 year ago
