Open-Awesome

© 2026 Open-Awesome. Curated for the developer elite.


Streaming

"Awesome Streaming" is a curated collection of resources on streaming technologies, which process and distribute data in real time. The list covers frameworks, libraries, tools, tutorials, and community resources across different streaming protocols and architectures. It is aimed at developers, data engineers, and researchers who want to build or improve streaming solutions in their applications, and it gives them a starting point for managing and analyzing streaming data.

data-streaming · real-time-processing · streaming-frameworks · streaming-tools · data-engineering · event-streaming · streaming-tutorials
3.0k stars · 312 forks · 0 contributors
Community-curated · Updated weekly · 100% open source


Related Awesome Lists

Public Datasets

The "Awesome Public Datasets" project is a curated collection of publicly available datasets across various domains, including government, healthcare, finance, and social sciences. This list features datasets in multiple formats, along with links to tools and platforms that facilitate data analysis and visualization. It is an invaluable resource for researchers, data scientists, and students looking to access high-quality data for their projects or studies. By providing a wide array of datasets, this collection empowers users to explore, analyze, and derive insights from real-world data. Dive in to discover the wealth of information available for your next data-driven endeavor!

73.8k stars

Big Data

The "Awesome Big Data" project is a curated collection of resources focused on big data technologies and practices that enable the processing and analysis of vast amounts of data. This list encompasses a variety of categories, including frameworks, tools, libraries, databases, and tutorials that cater to both beginners and experienced data professionals. Users can explore resources related to data storage, processing, analytics, and visualization, making it an invaluable asset for data scientists, engineers, and researchers. Whether you're looking to enhance your big data skills or find the right tools for your projects, this collection provides a comprehensive guide to navigating the big data landscape.

14.3k stars

Data Engineering

The "Awesome Data Engineering" project is a curated collection of resources aimed at supporting professionals in the field of data engineering, which involves the design and construction of systems for collecting, storing, and analyzing data. This list encompasses a variety of categories, including data pipelines, ETL tools, data warehousing solutions, frameworks, and best practices, as well as tutorials and community resources. Whether you are a beginner looking to understand the fundamentals or an experienced engineer seeking advanced techniques, this list offers valuable insights and tools to enhance your data engineering projects. Dive into this collection to discover the tools and methodologies that can streamline your data workflows and improve your data management capabilities.

8.5k stars

Network Analysis

The "Awesome Network Analysis" project is a curated collection of resources focused on the study and analysis of networks, which are structures made up of interconnected elements. This list encompasses a variety of tools, libraries, datasets, and tutorials that facilitate the exploration of network theory, graph analysis, and visualization techniques. It serves as a valuable resource for researchers, data scientists, and enthusiasts interested in understanding complex systems, social networks, and data relationships. Whether you are a beginner looking to grasp the basics or an experienced analyst seeking advanced methodologies, this collection provides essential tools and insights to enhance your network analysis projects.

4.0k stars

Table of Contents

12 sections · 102 projects

Engine

27 projects
Apache Apex

A unified platform for big data stream and batch processing on Hadoop YARN with enterprise-grade operability.

#stream-processing #batch-processing #real-time-analytics
Stars: 350 · Forks: 170 · Last commit: 5 years ago

Apache Ballista

A distributed query execution engine that extends Apache DataFusion to run SQL queries in parallel across multiple nodes.

#parallel-computing #distributed #dataframe
Stars: 2,057 · Forks: 284 · Last commit: 1 day ago

Apache Heron (incubating)

A real-time, distributed, fault-tolerant stream processing engine originally developed at Twitter.

#stream-processing #real-time-analytics #distributed-systems
Stars: 3,632 · Forks: 583 · Last commit: 3 years ago

Apache Samza

A distributed stream processing framework built on Apache Kafka and Apache Hadoop YARN for fault-tolerant, stateful processing.

#real-time-analytics #apache-yarn #samza
Stars: 843 · Forks: 332 · Last commit: 23 days ago

ArkFlow

A high-performance Rust stream processing engine with integrated AI capabilities for real-time data processing and intelligent analysis.

#stream-processing #event-driven #ai-integration
Stars: 1,283 · Forks: 43 · Last commit: 9 days ago

Arroyo

A distributed stream processing engine in Rust that performs stateful computations on real-time data with subsecond results.

#stream-processing #event-driven #sql-engine
Stars: 4,933 · Forks: 361 · Last commit: 2 days ago

AthenaX

A SQL-based streaming analytics platform that scales to process hundreds of billions of real-time events daily.

#apache-flink #event-processing #flink
Stars: 1,223 · Forks: 282 · Last commit: 6 years ago

Bytewax

A Python framework and Rust-based distributed processing engine for stateful event and stream processing.

#stream-processing #event-driven #data-science
Stars: 2,016 · Forks: 109 · Last commit: 18 days ago

CocoIndex

An ultra-performant data transformation framework for AI, with incremental processing and data lineage built-in.

#data-indexing #semantic-search #data-lineage
Stars: 10,215 · Forks: 801 · Last commit: 12 hours ago

Gearpump

A lightweight real-time big data streaming engine built on Akka for high-throughput, low-latency data processing.

#stream-processing #akka #cluster-computing
Stars: 758 · Forks: 150 · Last commit: 5 days ago

Hazelcast Jet

An open-source, in-memory, distributed batch and stream processing engine for Java applications.

#stream-processing #event-processing #hacktoberfest
Stars: 1,111 · Forks: 203 · Last commit: 1 year ago

hailstorm

A distributed stream processing system written in Haskell that guarantees exactly-once semantics.

#stream-processing #haskell #real-time-analytics
Stars: 93 · Forks: 8 · Last commit: 12 years ago

mantis

A platform for building real-time, cost-effective, operations-focused applications.

#stream-processing #realtime-analytics #gradle
Stars: 1,463 · Forks: 221 · Last commit: 2 days ago

mupd8 (muppet)

A MapReduce-style framework for processing fast/streaming data, implementing the MapUpdate model.

#stream-processing #mapreduce #data-framework
Stars: 128 · Forks: 35 · Last commit: 5 years ago

NebulaStream

An end-to-end data management system for IoT, optimizing stream processing across cloud, edge, and sensor deployments.

#stream-processing #sql-engine #streamprocessing
Stars: 84 · Forks: 35 · Last commit: 15 hours ago

Numaflow

A Kubernetes-native, serverless platform for running massively parallel data and streaming jobs with exactly-once semantics.

#stream-processing #hacktoberfest #event-driven-architecture
Stars: 2,487 · Forks: 154 · Last commit: 18 hours ago

Onyx

A masterless, cloud-scale, fault-tolerant distributed computation system for batch and stream processing written in Clojure.

#stream-processing #batch-processing #distributed
Stars: 2,049 · Forks: 200 · Last commit: 6 years ago

Pathway

A Python ETL framework for stream processing, real-time analytics, and building live LLM/RAG pipelines, powered by a scalable Rust engine.

#stream-processing #batch-processing #machine-learning-algorithms
Stars: 63,065 · Forks: 1,679 · Last commit: 1 day ago

Scramjet Cloud Platform

A runtime supervisor for deploying and running data processing programs called Sequences on Linux servers, Docker, and Kubernetes clusters.

#stream-processing #runtime-supervisor #serverless
Stars: 69 · Forks: 7 · Last commit: 1 year ago

tigon

An open-source real-time stream processing framework combining high-throughput event processing with low-latency SQL-like streaming queries.

#stream-processing #event-processing #real-time-analytics
Stars: 284 · Forks: 33 · Last commit: 9 years ago

Trill

A high-performance one-pass in-memory streaming analytics engine for temporal and streaming data.

#query-processor #real-time-processing #in-memory-engine
Stars: 1,266 · Forks: 132 · Last commit: 2 years ago

Wallaroo

A fast, resilient distributed stream processing framework for building real-time data applications that scale easily.

#stream-processing #api #high-performance
Stars: 1,485 · Forks: 67 · Last commit: 5 years ago

LightSaber

A multi-core stream processing engine for high-throughput window aggregation with optional exactly-once fault tolerance.

#stream-processing #multi-core #window-aggregation
Stars: 74 · Forks: 19 · Last commit: 4 years ago

HStreamDB

An open-source, cloud-native streaming database designed for real-time data processing and IoT applications.

#stream-processing #iot #haskell
Stars: 723 · Forks: 54 · Last commit: 1 year ago

Kuiper

A lightweight IoT data analytics and stream processing engine for resource-constrained edge devices.

#stream-processing #iot #rule-engine
Stars: 1,711 · Forks: 457 · Last commit: 4 days ago

WindFlow

paragroup.github.io

RisingWave

An enterprise-grade event streaming platform that ingests, processes, and manages real-time event data with PostgreSQL compatibility and Apache Iceberg™ integration.

#stream-processing #etl-pipeline #database
Stars: 9,067 · Forks: 775 · Last commit: 10 hours ago

Library

13 projects
Apache Kafka Streams

A client library for building stream processing applications on Apache Kafka, the distributed event streaming platform for high-performance data pipelines, streaming analytics, and data integration.

#stream-processing #message-queue #data-integration
Stars: 32,731 · Forks: 15,258 · Last commit: 21 hours ago

Streamiz

A .NET stream processing library for Apache Kafka, providing a Kafka Streams-like API for building real-time applications.

#stream-processing #event-driven #kafka-streams-dotnet
Stars: 534 · Forks: 80 · Last commit: 1 month ago

Akka Streams

A back-pressured stream processing library built on Akka, a platform for building highly responsive, resilient, and scalable distributed systems using the actor model.

#stream-processing #distributed-actors #akka
Stars: 13,276 · Forks: 3,547 · Last commit: 4 days ago

Benthos

A high-performance, resilient stream processor that connects various sources and sinks, performs data transformations, and guarantees at-least-once delivery.

#stream-processing #declarative-config #cqrs
Stars: 8,678 · Forks: 944 · Last commit: 9 hours ago

FS2 (prev. 'Scalaz-Stream')

A purely functional, effectful, and polymorphic stream processing library for Scala built on Cats and Cats-Effect.

#stream-processing #functional-programming #scala-js
Stars: 2,448 · Forks: 630 · Last commit: 7 days ago

FastStream

An asynchronous Python framework for building services that interact with Apache Kafka, RabbitMQ, NATS, and Redis event streams.

#stream-processing #asyncio #redis
Stars: 5,207 · Forks: 356 · Last commit: 1 day ago

monix

A high-performance Scala library for composing asynchronous, event-based programs with strong functional programming influences.

#stream-processing #back-pressure #functional-programming
Stars: 1,934 · Forks: 244 · Last commit: 6 days ago

Quix Streams

A Python framework for building real-time data pipelines and event-driven microservices on Apache Kafka using a Streaming DataFrame API.

#stream-processing #streaming-data-processing #event-driven-architecture
Stars: 1,554 · Forks: 106 · Last commit: 3 days ago

Streamline

A visual development platform for building, deploying, and managing streaming analytics applications with multiple engine bindings.

#stream-processing #flink #storm
Stars: 167 · Forks: 95 · Last commit: 2 years ago

Substation

A serverless toolkit for routing, normalizing, and enriching security event and audit logs in AWS.

#aws-serverless #observability #log-enrichment
Stars: 402 · Forks: 33 · Last commit: 4 months ago

Tributary

A Python library for constructing reactive dataflow graphs and streaming computations as data models.

#real-time-processing #data-modeling #python-data-streams
Stars: 463 · Forks: 38 · Last commit: 1 month ago

YoMo

An open-source LLM function calling framework for building scalable, low-latency AI agents with geo-distributed edge infrastructure.

#realtime #stream-processing #distributed-cloud
Stars: 1,906 · Forks: 143 · Last commit: 3 days ago

MediaPipe

A cross-platform framework for building customizable on-device machine learning pipelines for live and streaming media.

#media-processing #video-processing #on-device-ml
Stars: 35,511 · Forks: 6,005 · Last commit: 3 days ago

Application

3 projects
straw

A scalable real-time search platform for streaming data using Apache Storm, Kafka, and Lucene.

#redis #storm #luwak
Stars: 102 · Forks: 21 · Last commit: 10 years ago

storm-crawler

A scalable, mature, and versatile web crawler built on Apache Storm for building low-latency, distributed crawling systems.

#distributed #real-time-processing #distributed-systems
Stars: 979 · Forks: 277 · Last commit: 3 days ago

Zilla

A stateless, multi-protocol proxy that bridges web apps, IoT devices, and microservices directly to Apache Kafka via declarative APIs.

#api-gateway #rest #event-driven-architecture
Stars: 690 · Forks: 70 · Last commit: 9 hours ago

IoT

3 projects
sensorbee

A lightweight stream processing engine designed specifically for IoT data processing and analytics.

#stream-processing #iot #lightweight-engine
Stars: 230 · Forks: 43 · Last commit: 6 years ago

Apache Edgent

An open source programming model and runtime for analyzing data and events on edge devices, reducing data transmission and storage costs.

#iot #embedded-systems #real-time-processing
Stars: 222 · Forks: 134 · Last commit: 6 years ago

Apache StreamPipes

A self-service IoT toolbox enabling non-technical users to connect, analyze, and explore industrial IoT data streams.

#stream-processing #iot #mqtt
Stars: 725 · Forks: 236 · Last commit: 1 day ago

DSL

2 projects
Esper

A Java/.NET component for complex event processing (CEP), streaming SQL, and event series analysis.

#stream-processing #compiler #open-source
Stars: 876 · Forks: 264 · Last commit: 2 years ago

summingbird

A library for writing MapReduce programs that execute on distributed platforms like Storm and Scalding using Scala/Java collection-like syntax.

#stream-processing #mapreduce #batch-processing
Stars: 2,125 · Forks: 256 · Last commit: 4 years ago