A Storm spout that fetches data records from Amazon Kinesis and emits them as tuples for real-time stream processing.
Amazon Kinesis Storm Spout is a Java library that integrates Amazon Kinesis with Apache Storm, allowing developers to fetch data records from Kinesis streams and emit them as tuples within Storm topologies. It provides a reliable spout that stores its checkpoint state in ZooKeeper and retries failed records a configurable number of times.
Java developers building real-time stream processing applications with Apache Storm who need to ingest data from Amazon Kinesis streams.
Developers choose this spout for its seamless integration between Kinesis and Storm, built-in checkpoint management with ZooKeeper, and fault-tolerant design with configurable retry mechanisms, eliminating the need to build custom Kinesis consumers for Storm topologies.
Kinesis spout for Storm
Stores checkpoint state in ZooKeeper to track stream positions, ensuring fault-tolerant processing and data continuity after failures, as highlighted in the README.
Retries failed records a configurable number of times, with a default retry limit and error logging, which improves reliability and reduces data loss.
Buffers pending records in memory to allow re-emission without re-fetching from Kinesis, minimizing latency and network overhead.
Allows custom implementations of IKinesisRecordScheme to emit structured tuples, adapting to various data formats beyond the default scheme.
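The retry and buffering behavior described above can be illustrated with a minimal, self-contained sketch. This is not the library's code: the class PendingRecordBuffer, its method names, and the use of plain strings for sequence numbers and payloads are all hypothetical, chosen only to show how an in-memory buffer lets a spout re-emit a failed record without fetching it again, and how a retry limit bounds that loop.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical illustration of bounded retries backed by an in-memory buffer.
 * Not part of the Amazon Kinesis Storm Spout API.
 */
final class PendingRecordBuffer {

    /** A buffered record plus the number of times it has been re-emitted. */
    private static final class Pending {
        final String data;
        int retries;

        Pending(String data) {
            this.data = data;
        }
    }

    private final int maxRetries;
    private final Map<String, Pending> inFlight = new HashMap<>();

    PendingRecordBuffer(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /** Remember a record when it is first emitted, keyed by its sequence number. */
    void emitted(String sequenceNumber, String data) {
        inFlight.put(sequenceNumber, new Pending(data));
    }

    /** On ack the record is fully processed, so it can leave the buffer. */
    void ack(String sequenceNumber) {
        inFlight.remove(sequenceNumber);
    }

    /**
     * On fail, return the buffered payload for re-emission, or empty once the
     * retry limit is reached (the record is then dropped and an error is logged).
     */
    Optional<String> fail(String sequenceNumber) {
        Pending pending = inFlight.get(sequenceNumber);
        if (pending == null) {
            return Optional.empty(); // unknown or already acked
        }
        if (pending.retries >= maxRetries) {
            inFlight.remove(sequenceNumber);
            System.err.println("Giving up on record " + sequenceNumber
                    + " after " + pending.retries + " retries");
            return Optional.empty();
        }
        pending.retries++;
        return Optional.of(pending.data);
    }

    /** Number of records still awaiting an ack. */
    int size() {
        return inFlight.size();
    }
}
```

In a real spout, the equivalents of ack and fail would be driven by Storm's tuple callbacks, and a checkpoint would only advance past records that have left the buffer. Those details are omitted here.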
The README lists automatic handling of closed, split, and merged shards as 'Future Work', so resharding requires manual intervention, such as running 'storm rebalance'.
Requires multiple external libraries including AWS SDK, Commons Lang, Guava, and ZooKeeper, increasing deployment complexity and potential version conflicts.
Last release was in 2015, suggesting limited active development and potential incompatibility with newer Storm versions or AWS SDK updates.