Data Science

624 projects

R package containing datasets and code examples for the book 'Statistical Analysis of Network Data with R, 2nd Edition'.

#statistical-analysis #igraph #r-package

Stars: 300

Forks: 188

Last commit: 6 years ago

terraform-provider-iterative · Go

A Terraform plugin for managing machine learning compute resources across AWS, GCP, Azure, and Kubernetes with spot instance recovery and auto-termination.

#developer-tools #devops #multi-cloud

An R package for automatic optimal predictor ensembling via cross-validation with dozens of machine learning algorithms.

#ensemble-learning #parallel-computing #hyperparameter-optimization

Stars: 294

Forks: 76

Last commit: 7 months ago

Geni · Clojure

An idiomatic Clojure dataframe library that runs on Apache Spark, providing a seamless interface for data processing and machine learning.

#apache-spark #high-performance-computing #spark

Stars: 294

Forks: 26

Last commit: 2 years ago

Maze · Python

An application-oriented Deep Reinforcement Learning framework for real-world decision problems, covering simulation to deployment.

#hydra-config #simulation #distributed

Stars: 293

Forks: 12

Last commit: 1 month ago

ipycytoscape · Python

A Jupyter widget for interactive graph visualization using cytoscape.js in notebooks and JupyterLab.

#notebook-tools #data-science #cytoscape

Stars: 290

Forks: 61

Last commit: 18 days ago

JuliaCall · HTML

An R package that embeds Julia for high-performance numerical computing, enabling seamless interoperability between R and Julia.

#scientific-computing #julia #high-performance

Stars: 287

Forks: 42

Last commit: 23 days ago

mlr3book · TeX

Free online version of the 'Applied Machine Learning Using mlr3 in R' textbook, built with Quarto.

#bookdown #mlr3 #data-science

Stars: 281

Forks: 69

Last commit: 11 days ago

rb-libsvm · C++

Ruby language bindings for the LIBSVM library, enabling support vector machine (SVM) classification and regression in Ruby.

#libsvm #svm-learning #ruby-bindings

Stars: 279

Forks: 34

Last commit: 2 years ago

pandaset-devkit · Jupyter Notebook

A Python devkit for loading, exploring, and manipulating the PandaSet, a large-scale autonomous driving dataset with LiDAR, camera, and annotations.

#lidar #autonomous-driving #sensor-fusion

Stars: 278

Forks: 74

Last commit: 2 years ago

R Books List · R

A curated, categorized collection of books about the R programming language for data science, statistics, and visualization.

#data-science #statistics #r-programming

Stars: 278

Forks: 29

Last commit: 8 years ago

pysparkling · Python

A pure Python implementation of Apache Spark's RDD and DStream interfaces.

#apache-spark #data-science #python

Stars: 270

Forks: 45

Last commit: 1 year ago

Probably · Swift

A Swift library providing probability distributions and statistical functions for probabilistic computing.

#data-science #statistics #swift-package-manager

Stars: 268

Forks: 9

Last commit: 9 years ago

machine learning · JavaScript

A web interface and REST API for classification and regression using Support Vector Machine (SVM) and Support Vector Regression (SVR) algorithms.

#flask #data-science #rest-api

Stars: 258

Forks: 83

Last commit: 5 years ago

PyODDS · Python

An end-to-end Python outlier detection system with database support, automated machine learning, and unified APIs for statistical, ML, and deep learning models.

#statistical-analysis #database #python-library

Stars: 255

Forks: 39

Last commit: 3 years ago

ferry · Python

Define, run, and deploy big data applications on AWS, OpenStack, and local machines using Docker.

#devops #spark #data-science

Stars: 254

Forks: 25

Last commit: 11 years ago

elastic · R

R client for the Elasticsearch HTTP API, enabling data indexing, search, and analysis from R.

#data-indexing #database #r-package

Stars: 245

Forks: 59

Last commit: 7 months ago

R Type Provider · F#

An F# type provider that enables seamless interoperability with R packages, offering type-safe access to R functions from .NET.

#type-provider #type-providers #data-science

Stars: 244

Forks: 69

Last commit: 2 months ago

imbalanced-algorithms · Python

Python-based implementations of algorithms for learning on imbalanced data.

#imbalanced-data #notre-dame #data-science

Stars: 241

Forks: 67

Last commit: 4 years ago

GWU: Data Mining (Decision Sciences 6279) · Jupyter Notebook

Course materials for GWU's Data Mining and Machine Learning classes covering preprocessing, modeling, and practical Kaggle applications.

#data-science #kaggle #educational-materials

An idiomatic Clojure machine learning library providing a unified interface for classification, regression, and unsupervised models.

#metamorph #tech-ml-dataset #hyperparameter-optimization

Stars: 239

Forks: 15

Last commit: 8 months ago

RNeo4j · R

An R driver for Neo4j that enables reading and writing graph data directly from the R environment.

#neo4j-driver #r-package #data-science

Stars: 238

Forks: 64

Last commit: 7 years ago

Topic Models Resources · R

A curated collection of learning resources, R packages, and practical examples for understanding and applying topic modeling techniques.

#document-analysis #text-analysis #data-science

A fast feature selection algorithm for tree-based models like XGBoost, designed to outperform Boruta in speed and performance.

#algorithm #datascientist #feature-selection

Stars: 232

Forks: 36

Last commit: 24 days ago

AtomAI · Python

A PyTorch-based Python package for deep and machine learning analysis of microscopy data, designed for domain scientists.

#ensemble-learning #scientific-computing #image-analysis

Stars: 229

Forks: 42

Last commit: 1 year ago

git2r · R

R bindings to the libgit2 library, providing programmatic access to Git repositories from R.

#version-control #r-package #data-science

Stars: 222

Forks: 61

Last commit: 5 months ago

theme-darcula · CSS

A Darcula theme for JupyterLab, modeled after the classic IntelliJ theme, with dark scrollbar support.

#jupyterlab-extension #jupyterlab-2 #developer-tools

Stars: 219

Forks: 37

Last commit: 3 years ago

reshape2 · R

An R package for flexibly rearranging, reshaping, and aggregating data, now superseded by tidyr.

#r-package #data-science #r-programming

A charting library designed for interactive data visualization in F# scripting environments.

#functional-programming #data-science #fsi

Stars: 210

Forks: 66

Last commit: 6 years ago

animation · R

An R package for creating and exporting statistical animations to HTML, GIF, video, and PDF formats.

#gif-generation #animation #r-package

Stars: 209

Forks: 58

Last commit: 11 months ago

Rosetta · Jupyter Notebook

A Python toolkit for text-focused data science on medium-sized datasets, bridging memory and cluster-scale processing.

#stream-processing #multiprocessing #scientific-computing

An R interface to Google's V8 JavaScript and WebAssembly engine for executing JavaScript code within R.

#webassembly #javascript-engine #r-package

Stars: 207

Forks: 30

Last commit: 2 months ago

spatstat · R

A comprehensive family of R packages for analyzing spatial point pattern data and other spatial data types.

#statistical-models #epidemiology #spatstat

Stars: 206

Forks: 43

Last commit: 3 days ago

Dataset · Python

A Python library for building lazy data processing and machine learning workflows that handle datasets larger than memory.

#pipeline-framework #batch-processing #workflow

Stars: 206

Forks: 45

Last commit: 14 days ago
