A high-performance, real-time analytics database built for fast queries and fast ingest, reducing time to insight.
Apache Druid is a high-performance, real-time analytics database designed for fast queries and fast data ingestion. It reduces time to insight by enabling low-latency analytics on both streaming and batch data, making it well suited for operational dashboards and interactive ad-hoc analysis.
Data engineers, analytics teams, and developers building real-time dashboards, monitoring systems, or applications requiring fast ad-hoc queries on large datasets.
Developers choose Druid for its ability to deliver sub-second query performance on high-volume data streams, its support for both batch and real-time ingestion, and its scalability to handle high concurrency—making it a robust open-source alternative to commercial data warehouses for real-time analytics.
Apache Druid: a high-performance, real-time analytics database.
Delivers sub-second query performance on large datasets, optimized for real-time analytics to reduce time to insight, as emphasized in the README's value proposition.
Supports both streaming (e.g., Kafka) and batch data ingestion, enabling immediate analytics from live data, highlighted in the key features and GIF demonstrations.
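To make the streaming-ingestion feature concrete, here is a minimal sketch of a Kafka supervisor spec of the kind submitted to Druid to start continuous ingestion. The datasource name, topic, broker address, and column names are hypothetical placeholders, and the exact spec fields can vary by Druid version.

```python
import json

# Minimal Kafka ingestion supervisor spec (illustrative sketch; the
# "wiki-events" datasource/topic and localhost broker are hypothetical).
supervisor_spec = {
    "type": "kafka",
    "spec": {
        "dataSchema": {
            "dataSource": "wiki-events",  # hypothetical datasource name
            "timestampSpec": {"column": "ts", "format": "iso"},
            "dimensionsSpec": {"dimensions": ["page", "user"]},
            "granularitySpec": {
                "segmentGranularity": "hour",
                "queryGranularity": "none",
            },
        },
        "ioConfig": {
            "topic": "wiki-events",  # hypothetical Kafka topic
            "consumerProperties": {"bootstrap.servers": "localhost:9092"},
            "useEarliestOffset": True,
        },
        "tuningConfig": {"type": "kafka"},
    },
}

# Submitting this spec would be an HTTP POST to the Overlord's supervisor
# endpoint (commonly /druid/indexer/v1/supervisor) on a running cluster.
print(json.dumps(supervisor_spec, indent=2))
```

Once accepted, the supervisor manages Kafka consumer tasks and hands finished segments off for historical serving, which is how rows become queryable within seconds of arriving on the topic.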
Handles many simultaneous queries without degradation, making it ideal for multi-user dashboards and operational workloads, as noted in the design goals.
Provides HTTP and JDBC interfaces plus a built-in web console for data loading and cluster management, with SQL system tables for operational transparency, per the README.
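The HTTP interface mentioned above accepts SQL as a JSON payload POSTed to the `/druid/v2/sql` endpoint. A minimal sketch using only the standard library follows; the host/port (the quickstart Router) and the `web_requests` table are assumptions, not part of this catalog entry.

```python
import json
import urllib.request

def build_sql_request(host: str, sql: str) -> urllib.request.Request:
    """Build a POST request carrying a Druid SQL query as JSON.

    The /druid/v2/sql path is Druid's SQL endpoint; the host is an
    assumption (localhost:8888 is the Router in the quickstart setup).
    """
    payload = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}/druid/v2/sql",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_sql_request(
    "localhost:8888",
    # "web_requests" is a hypothetical datasource; __time is Druid's
    # built-in timestamp column.
    "SELECT COUNT(*) AS cnt FROM web_requests "
    "WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR",
)
# urllib.request.urlopen(req) would execute the query against a live cluster;
# the same endpoint also serves system tables such as sys.segments.
print(req.full_url)
```

The JDBC route uses Druid's Avatica-based driver instead, but the SQL text is the same either way.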
Requires JDK 17 or 21 for building, and a production deployment spans multiple service types (e.g., Coordinator, Broker, Historical), making initial configuration non-trivial, as seen in the build guide and Docker/Kubernetes dependencies.
Optimized for analytics with append-heavy ingestion; updates and deletions are cumbersome, making it a poor fit for transactional workflows that require frequent record-level changes.
Involves managing a specialized architecture with concepts like segments and supervisors, which can overwhelm teams without prior experience in distributed data systems.