Run Jupyter notebooks as REST API endpoints, enabling programmatic execution of notebook workflows.
A Julia package providing easy access to 700+ standard R datasets for data analysis and statistical learning.
A JRuby gem that provides Ruby-friendly access to Apache Mahout's scalable machine learning capabilities for recommendations.
An R package that installs packages from MRAN snapshots to ensure reproducible environments by locking package versions to a specific date.
A scalable high-performance platform for R that enables large-scale machine learning, statistical analysis, and graph processing across clusters.
A resource and evaluation framework for benchmarking link prediction models on large-scale, heterogeneous biomedical knowledge graphs.
A Julia package for reproducible data setup, automating dataset downloads and management for scientific computing.
A bridge library enabling Clojure to call R functions and use R objects for statistical computing and data science.
A curated list of colleges and universities worldwide offering data science degrees.
Open-source implementation of the winning solution for the 2018 Data Science Bowl Kaggle competition using PyTorch and U-Net.
An open-source starter solution for the Kaggle Toxic Comment Classification Challenge, providing ready-to-use machine learning pipelines for detecting online harassment.
A simple machine learning framework written in Swift, currently focusing on regression algorithms.
An open platform for hosting and participating in data science challenges focused on open science and open data.
A GitHub Action to build and push Jupyter-enabled Docker images from data science repositories using repo2docker.
A GitHub Action that automatically tests Jupyter notebooks from top to bottom using nbmake and pytest.
An R package providing 2,260 network datasets in igraph format from diverse sources like social networks, animal interactions, and movie co-stars.
An R package that provides a bidirectional interface for calling Julia code from R and mapping objects between both languages.
A convenience meta-package that loads essential Julia packages for statistics with a single import.
A Rust DataFrame and data engineering library with PySpark/SQL-like syntax, built for business data pipelines with Microsoft stack integration.
An in-memory machine learning library for Scala with a scikit-learn-like API, built on Breeze for parallel and distributed systems.
Course materials for UCLA's STATS 418 - Tools in Data Science covering R packages, machine learning libraries, databases, and reproducibility tools.
A lightweight Python library for building reproducible machine learning pipelines with minimal interface constraints.
A Python library for interacting with QuPath, providing a pythonic interface to manage and analyze digital pathology projects.
A machine learning library for Clojure built on top of Weka, providing filters, classifiers, regression, and clustering algorithms.
A Swift library for numerical computing with numpy-like APIs and Jupyter-like playground notebooks.
A GitHub template for automating machine learning workflows on Azure using GitHub Actions.
A Python library that brings Chart.js interactive charts to Jupyter notebooks with a familiar API.
Example code and materials demonstrating practical applications of SAS machine learning techniques.
A PHP framework for building complex recommendation engines on top of Neo4j graph databases.
A desktop application for interactive computing with Jupyter notebooks, supporting multiple kernels and rich outputs.
A Python library implementing fairness-aware machine learning algorithms for measuring and mitigating discrimination in predictive models.
An open-source benchmark solution for the Kaggle TGS Salt Identification Challenge using semantic segmentation.
A Ruby interface to XGBoost, providing high-performance gradient boosting for machine learning tasks.
A Julia library providing a consistent API for common machine learning algorithms, designed for practitioners working with in-memory datasets.
A declarative data-flow programming framework built on Zenoh for building applications that span from cloud to edge devices.