A distributed database for high-performance computing with in-memory speed, ACID compliance, and ANSI SQL support.
A Scala API for Cascading that simplifies writing Hadoop MapReduce jobs.
A distributed computation system written in Go for parallel and cluster processing, similar to Hadoop MapReduce and Spark.
A library for writing MapReduce programs in a syntax resembling Scala/Java collections and executing them on distributed platforms such as Storm and Scalding.
A native integration library for using Elasticsearch with Hadoop, Spark, and Hive, supporting real-time search and analytics on big data.
A library enabling MongoDB to serve as input source or output destination for Hadoop MapReduce tasks and ecosystem tools.
A collection of R packages for interacting with the Hadoop ecosystem, enabling big data analysis from R.
A Clojure DSL for Apache Spark that enables distributed data processing using idiomatic Clojure.
A Python wrapper for Cascading that enables building and controlling Hadoop data processing workflows entirely in Python.
A Hadoop library for reading and processing packet capture (PCAP) files in MapReduce jobs and Hive queries.
A Groovy wrapper for the MongoDB Java driver providing a simpler, less verbose API.
A Haskell driver for MongoDB that enables database connections, queries, updates, and administrative tasks.
A MapReduce-style framework for processing fast (streaming) data, implementing the MapUpdate model.
A collection of libraries for large-scale data processing in the Hadoop ecosystem, including Spark, Pig, and incremental MapReduce.
A collection of interactive Jupyter notebooks for learning Hadoop, Spark, and MapReduce with hands-on tutorials and demos.
A production-grade HBase ORM library for clean, fast, and fun object-oriented data access, also compatible with Google Cloud Bigtable.
Mozilla's utility library for Hadoop, HBase, Pig, and related big data technologies.