A fault-tolerant service that persists Kafka log data to cloud storage like S3, GCS, Azure Blob Storage, and OpenStack Swift.
Secor is a distributed service that reliably persists Kafka log data to cloud object storage systems like Amazon S3, Google Cloud Storage, Microsoft Azure Blob Storage, and OpenStack Swift. It ensures strong consistency and fault tolerance, enabling scalable log archiving and seamless integration with analytics systems such as Hive.
Intended for data engineers and platform teams who manage Kafka-based data pipelines and need durable, consistent storage of log data in cloud object storage for analytics and compliance.
Developers choose Secor for its strong consistency guarantees, ensuring each Kafka message is saved exactly once despite eventual consistency models in cloud storage, and for its horizontal scalability and fault tolerance in distributed environments.
Secor is a service implementing Kafka log persistence.
Guarantees that each Kafka message is persisted exactly once to cloud storage, even on backends with eventual consistency models like S3's, as described under strong consistency in the README.
Components can crash at any point without compromising data integrity, which keeps the service reliable in distributed, failure-prone environments. The README lists this as a core feature.
Scales horizontally: Secor processes can be added or removed without affecting data consistency, so load can be spread across multiple machines as Kafka traffic grows.
Automatically partitions output data by day, hour, or minute for direct import into systems like Hive, facilitating efficient analytics workflows as described in the output partitioning feature.
Exposes metrics via Ostrich and Micrometer, with export options to OpenTSDB and StatsD for performance tracking, as noted in the monitoring section of the README.
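To make the output partitioning by day, hour, or minute more concrete, here is a minimal Python sketch of how a message timestamp could be mapped to a Hive-style partition path. It is an illustration only, not Secor's implementation: the function name `partition_path`, the `dt=`/`hr=`/`min=` key names, and the use of UTC are all assumptions chosen for the example.

```python
from datetime import datetime, timezone

# Hypothetical helper: none of these names come from Secor itself.
GRANULARITIES = ("day", "hour", "minute")


def partition_path(topic: str, timestamp_ms: int, granularity: str = "hour") -> str:
    """Build a Hive-style partition path for a message timestamp (UTC assumed)."""
    if granularity not in GRANULARITIES:
        raise ValueError(f"granularity must be one of {GRANULARITIES}")

    ts = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)

    # Every path gets a date partition; finer granularities add more levels.
    parts = [topic, ts.strftime("dt=%Y-%m-%d")]
    if granularity in ("hour", "minute"):
        parts.append(ts.strftime("hr=%H"))
    if granularity == "minute":
        parts.append(ts.strftime("min=%M"))
    return "/".join(parts)


if __name__ == "__main__":
    # 1400000000000 ms is 2014-05-13 16:53:20 UTC.
    for g in GRANULARITIES:
        print(partition_path("clicks", 1400000000000, g))
```

Paths of the form `key=value/key=value` are what Hive expects for partitioned external tables, which is why a layout like this can be imported without reshuffling files.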
Requires deploying and managing a distributed service. Even with detailed configuration guides for setup and Kubernetes, this adds significant overhead compared to simpler log-sinking tools.
While customizable via external classes, Secor lacks out-of-the-box support for many data formats, forcing teams to implement custom parsers for non-standard log messages, which can increase development time.
Tightly coupled with Apache Kafka, making it unsuitable for projects using other message brokers or streaming platforms, limiting its applicability in heterogeneous environments.
Because of the strong consistency model, messages are buffered until a configurable size- or time-based upload policy triggers, so data may not appear in cloud storage immediately. This is a trade-off for use cases that require low-latency data availability.
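The buffering trade-off above can be illustrated with a small Python sketch of a size-or-time upload policy. This is a simplified model under stated assumptions, not Secor's code: the class `UploadPolicy`, its field names, and the thresholds used below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UploadPolicy:
    """Hypothetical policy: upload when a buffered file is big enough or old enough."""

    max_file_size_bytes: int
    max_file_age_seconds: float

    def should_upload(self, size_bytes: int, age_seconds: float) -> bool:
        # Either threshold alone is sufficient to trigger an upload.
        return (
            size_bytes >= self.max_file_size_bytes
            or age_seconds >= self.max_file_age_seconds
        )


if __name__ == "__main__":
    # Assumed thresholds for illustration: 200 MB or one hour.
    policy = UploadPolicy(max_file_size_bytes=200_000_000, max_file_age_seconds=3600)
    print(policy.should_upload(size_bytes=1_000, age_seconds=60))        # small and fresh
    print(policy.should_upload(size_bytes=200_000_000, age_seconds=60))  # size threshold hit
    print(policy.should_upload(size_bytes=1_000, age_seconds=3600))      # age threshold hit
```

In a model like this, a low-traffic topic rarely reaches the size threshold, so the age threshold effectively sets how long its data waits before becoming visible in cloud storage. Lowering it reduces latency at the cost of more, smaller files.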