Streaming

The "Awesome Streaming" project is a curated collection of resources on streaming technologies, which enable real-time processing and distribution of data. The list covers frameworks, libraries, tools, tutorials, and community resources for different streaming protocols and architectures. It is aimed at developers, data engineers, and researchers who want to implement or improve streaming solutions in their applications, and who need tools for managing and analyzing streaming data.

data-streaming real-time-processing streaming-frameworks streaming-tools data-engineering event-streaming streaming-tutorials

3.0k stars · 312 forks · 0 contributors

Engine

10 projects

Apache Ballista

A distributed query execution engine that extends Apache DataFusion to run SQL queries in parallel across multiple nodes.

#parallel-computing #distributed #dataframe

Stars: 2,021

Forks: 270

Last commit: 2 days ago

Apache Heron (incubating)

Apache Heron is a real-time, distributed, fault-tolerant stream processing engine developed by Twitter.

#stream-processing #real-time-analytics #distributed-systems

Stars: 3,647

Forks: 583

Last commit: 3 years ago

Arroyo

A distributed stream processing engine in Rust that performs stateful computations on real-time data with subsecond results.

#stream-processing #event-driven #sql-engine

Stars: 4,885

Forks: 351

Last commit: 2 days ago

Bytewax

A Python framework and Rust-based distributed processing engine for stateful event and stream processing.

#stream-processing #event-driven #data-science

Stars: 1,980

Forks: 107

Last commit: 1 year ago

CocoIndex

An ultra-performant data transformation framework for AI, with incremental processing and data lineage built-in.

#data-indexing #semantic-search #data-lineage

A Kubernetes-native, serverless platform for running massively parallel data and streaming jobs with exactly-once semantics.

#stream-processing #hacktoberfest #event-driven-architecture

A masterless, cloud-scale, fault-tolerant distributed computation system for batch and stream processing written in Clojure.

#stream-processing #batch-processing #distributed

Stars: 2,051

Forks: 202

Last commit: 6 years ago

Pathway

A Python ETL framework for stream processing, real-time analytics, and building live LLM/RAG pipelines, powered by a scalable Rust engine.

#stream-processing #batch-processing #machine-learning-algorithms

A lightweight IoT data analytics and stream processing engine for resource-constrained edge devices.

#stream-processing #iot #rule-engine

An enterprise-grade event streaming platform that ingests, processes, and manages real-time event data with PostgreSQL compatibility and Apache Iceberg™ integration.

#stream-processing #etl-pipeline #database

Stars: 8,940

Forks: 756

Last commit: 23 hours ago

Library

8 projects

Apache Kafka Streams

A client library for building stream processing applications on Apache Kafka, a distributed event streaming platform for high-performance data pipelines, streaming analytics, and data integration.

#stream-processing #message-queue #data-integration

A platform for building highly responsive, resilient, and scalable distributed systems using the actor model.

#stream-processing #distributed-actors #akka

Stars: 13,275

Forks: 3,556

Last commit: 2 days ago

Benthos

A high-performance, resilient stream processor that connects various sources and sinks, performs data transformations, and guarantees at-least-once delivery.

#stream-processing #declarative-config #cqrs

Stars: 8,644

Forks: 936

Last commit: 1 day ago

FS2 (previously 'Scalaz-Stream')

A purely functional, effectful, and polymorphic stream processing library for Scala built on Cats and Cats-Effect.

#stream-processing #functional-programming #scala-js

Stars: 2,446

Forks: 632

Last commit: 6 days ago

FastStream

An asynchronous Python framework for building services that interact with Apache Kafka, RabbitMQ, NATS, and Redis event streams.

#stream-processing #asyncio #redis

Stars: 5,123

Forks: 340

Last commit: 2 days ago

monix

A high-performance Scala library for composing asynchronous, event-based programs with strong functional programming influences.

#stream-processing #back-pressure #functional-programming

Stars: 1,935

Forks: 244

Last commit: 13 days ago

YoMo

An open-source LLM function calling framework for building scalable, low-latency AI agents with geo-distributed edge infrastructure.

#realtime #stream-processing #distributed-cloud

Cross-platform framework for building customizable on-device machine learning pipelines for live and streaming media.

#media-processing #video-processing #on-device-ml

Stars: 34,886

Forks: 5,928

Last commit: 1 day ago

Application

0 projects

IoT

0 projects

DSL

1 project

summingbird

A library for writing MapReduce programs that execute on distributed platforms like Storm and Scalding using Scala/Java collection-like syntax.

#stream-processing #mapreduce #batch-processing

Stars: 2,127

Forks: 259

Last commit: 4 years ago

Related Awesome Lists

Public Datasets

The "Awesome Public Datasets" project is a curated collection of publicly available datasets across domains including government, healthcare, finance, and social sciences. The list features datasets in multiple formats, along with links to tools and platforms for data analysis and visualization. It is aimed at researchers, data scientists, and students who need high-quality, real-world data for their projects or studies.

73.8k

Big Data

The "Awesome Big Data" project is a curated collection of resources focused on big data technologies and practices that enable the processing and analysis of vast amounts of data. This list encompasses a variety of categories, including frameworks, tools, libraries, databases, and tutorials that cater to both beginners and experienced data professionals. Users can explore resources related to data storage, processing, analytics, and visualization, making it an invaluable asset for data scientists, engineers, and researchers. Whether you're looking to enhance your big data skills or find the right tools for your projects, this collection provides a comprehensive guide to navigating the big data landscape.

14.3k

Data Engineering

The "Awesome Data Engineering" project is a curated collection of resources for data engineering, the design and construction of systems for collecting, storing, and analyzing data. The list covers data pipelines, ETL tools, data warehousing solutions, frameworks, and best practices, as well as tutorials and community resources. It serves both beginners learning the fundamentals and experienced engineers looking for advanced techniques to streamline data workflows and improve data management.

8.5k

Network Analysis

The "Awesome Network Analysis" project is a curated collection of resources focused on the study and analysis of networks, which are structures made up of interconnected elements. This list encompasses a variety of tools, libraries, datasets, and tutorials that facilitate the exploration of network theory, graph analysis, and visualization techniques. It serves as a valuable resource for researchers, data scientists, and enthusiasts interested in understanding complex systems, social networks, and data relationships. Whether you are a beginner looking to grasp the basics or an experienced analyst seeking advanced methodologies, this collection provides essential tools and insights to enhance your network analysis projects.

4.0k
