Enables distributed TensorFlow training and inference on Apache Spark and Hadoop clusters with minimal code changes.
A high-performance distributed map/reduce system with DAG execution, written in Go, supporting standalone or distributed modes.
A distributed computation system written in Go for parallel and cluster processing, similar to Hadoop MapReduce and Spark.
An open-source implementation of the Message Passing Interface (MPI) specification for high-performance computing.
A cluster computing framework for processing large-scale geospatial data within Apache Spark, Flink, and other big data systems.
A low-latency actor engine written in Go for building highly concurrent and distributed systems.
A distributed map-reduce framework for parallel computations over large datasets on unreliable computer clusters.
A genomics analysis platform that uses Apache Spark to parallelize genomic data processing across clusters, replacing traditional file-based workflows.
A lightweight real-time big data streaming engine built on Akka for high-throughput, low-latency data processing.
A Docker image for Apache Spark on YARN, built on Hadoop and CentOS for easy deployment.
An R package providing a lightweight frontend to use Apache Spark for distributed data processing from R.
A Clojure DSL for Apache Spark that enables distributed data processing using idiomatic Clojure.
A distributed video processing platform built on Apache Storm with OpenCV integration for large-scale computer vision pipelines.
A scalable high-performance platform for R that enables large-scale machine learning, statistical analysis, and graph processing across clusters.
An Apache Mesos framework for building Docker images across a cluster of machines, enabling scalable container builds.
An R extension for distributed computing using Apache Hive, enabling HQL queries in R and R functions in Hive.
Runs MPI programs on Hadoop YARN clusters using MPICH-3.1.2 and SSH.
A Common Lisp library for distributing computational tasks across multiple machines using the lparallel API.
A distributed load testing solution that enables running Gatling simulations across a cluster of machines for high-scale performance testing.