A Java-based tool for importing tabular data from JDBC sources into Elasticsearch for indexing.
Elasticsearch JDBC Importer is a Java application that fetches data from relational databases via JDBC and indexes it into Elasticsearch. It transforms tabular data into structured JSON documents so that database content can be searched and analyzed. The design favors simple, efficient streaming of tabular data.
It is aimed at developers and data engineers who need to synchronize relational database content (for example from MySQL, PostgreSQL, or SQL Server) with Elasticsearch for search and analytics. It suits simple tabular data streams better than deeply nested objects.
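To make the row-to-document transformation concrete, here is a minimal Python sketch rather than the importer's own Java code. It assumes that a dot in a column label (such as `customer.name`) signals nesting; the function name `row_to_document` and the sample columns are hypothetical and chosen only for illustration.

```python
import json


def row_to_document(columns, row):
    """Build one JSON-ready dict from a single tabular row.

    Assumption for this sketch: a dot in a column label means nesting,
    so "customer.name" becomes {"customer": {"name": ...}}.
    """
    doc = {}
    for label, value in zip(columns, row):
        target = doc
        *parents, leaf = label.split(".")
        # Walk (and create) the intermediate objects for dotted labels.
        for key in parents:
            target = target.setdefault(key, {})
        target[leaf] = value
    return doc


# Hypothetical result set: column labels plus two rows.
columns = ["id", "customer.name", "customer.city", "total"]
rows = [
    (1, "Ada", "London", 42.5),
    (2, "Linus", "Helsinki", 13.0),
]

documents = [row_to_document(columns, row) for row in rows]
for doc in documents:
    print(json.dumps(doc))
```

Each row becomes exactly one document here, which is why a flat result set maps cleanly while object graphs built from many joins are harder to reconstruct.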
Developers choose this tool for its straightforward, configuration-driven approach to JDBC-based data import, with built-in support for scheduled incremental syncs, bulk indexing with throttling, and flexible SQL execution including stored procedures. Its extensible architecture allows custom strategies for source, sink, and context handling.
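The incremental sync idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the importer's implementation: the state file layout (`{"last_run": ...}`), the `products` table, and the helper names are all hypothetical, and an in-memory SQLite database stands in for a JDBC source.

```python
import json
import os
import sqlite3
import tempfile

EPOCH = "1970-01-01 00:00:00"


def load_last_run(state_path):
    """Return the timestamp of the previous run, or EPOCH on the first run."""
    if not os.path.exists(state_path):
        return EPOCH
    with open(state_path) as fh:
        return json.load(fh)["last_run"]


def save_last_run(state_path, timestamp):
    """Persist the timestamp so the next run can resume from it."""
    with open(state_path, "w") as fh:
        json.dump({"last_run": timestamp}, fh)


def fetch_changed(conn, state_path, now):
    """Fetch only rows modified since the previous run, then advance the state."""
    last_run = load_last_run(state_path)
    rows = conn.execute(
        "SELECT id, name, updated_at FROM products "
        "WHERE updated_at > ? AND updated_at <= ? ORDER BY id",
        (last_run, now),
    ).fetchall()
    save_last_run(state_path, now)
    return rows


# Hypothetical source table with ISO-formatted timestamps (sortable as text).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, name TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "lamp", "2016-01-01 10:00:00"), (2, "desk", "2016-01-02 10:00:00")],
)
state_path = os.path.join(tempfile.mkdtemp(), "state.json")

first = fetch_changed(conn, state_path, "2016-01-03 00:00:00")
conn.execute("UPDATE products SET updated_at = '2016-01-04 09:00:00' WHERE id = 2")
second = fetch_changed(conn, state_path, "2016-01-05 00:00:00")
```

The first run returns both rows; the second returns only the row that changed in between. In a scheduled setup, a cron-style trigger would call `fetch_changed` on each tick.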
JDBC importer for Elasticsearch
Supports parameterized SQL, stored procedure calls, and write-back operations for acknowledging imported data, as detailed in the 'sql' parameter documentation.
Enables cron-based scheduling and stateful incremental fetching using timestamps via the 'statefile' parameter, so that each run syncs only new or updated records instead of re-importing the whole table.
Utilizes Elasticsearch's bulk API with configurable limits on request size, concurrency, and volume (e.g., 'max_bulk_actions'), optimizing performance for large data transfers.
Designed with pluggable source, sink, and context interfaces, allowing custom strategies for advanced use cases, as mentioned in the 'Developer notes' section.
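The bulk limits listed above can be illustrated with a small batching sketch in Python. It is a simplified model, not the importer's code: the function `bulk_batches` and its parameters `max_actions` and `max_bytes` are hypothetical stand-ins for limits on action count and request volume, and concurrency limits are left out.

```python
import json


def bulk_batches(documents, max_actions=3, max_bytes=1024):
    """Group documents into batches bounded by count and serialized size.

    A batch is flushed before adding a document that would exceed either
    limit. A single oversized document still gets a batch of its own.
    """
    batch, batch_bytes = [], 0
    for doc in documents:
        size = len(json.dumps(doc).encode("utf-8"))
        if batch and (len(batch) >= max_actions or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += size
    if batch:
        yield batch


# Seven small hypothetical documents, split by the action limit.
documents = [{"id": i, "name": f"item-{i}"} for i in range(7)]
batches = list(bulk_batches(documents, max_actions=3))
print([len(b) for b in batches])
```

Each yielded batch would correspond to one bulk request. Bounding both the number of actions and the payload size keeps individual requests predictable regardless of how wide the rows are.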
The compatibility matrix shows last support for Elasticsearch 2.3.4 in 2016, with no updates for newer versions, making it risky for modern deployments without forks or patches.
Requires manually downloading JDBC drivers and placing them in the lib folder, plus passing JSON configuration via the command line, which is less user-friendly than integrated tools with GUIs or simpler configs.
The README admits it's 'limited in the way to reconstruct deeply nested objects' from many joins, focusing on simple tabular data, which can be a barrier for complex database schemas.