An open-source unit test framework for Hive SQL queries that runs on JUnit 4 and 5 and enables TDD without a Hive or Hadoop installation.
HiveRunner is an open-source unit test framework specifically designed for Apache Hive SQL queries. It allows developers to write, test, and validate Hive scripts in isolation without needing a full Hive installation, integrating directly with JUnit for familiar testing workflows. It solves the problem of untested, hard-to-maintain Hive SQL by enabling Test-Driven Development (TDD) for data pipelines.
HiveRunner is aimed at data engineers, ETL developers, and data platform teams who build and maintain Hive-based data pipelines and need to ensure SQL code quality and reliability.
Developers choose HiveRunner because it removes the need for a local Hive or Hadoop installation, provides a fast feedback loop for Hive SQL changes, and enforces modular, testable code patterns, which reduces bugs in production data workflows.
An Open Source unit test framework for Hive queries based on JUnit 4 and 5
Needs only a Maven dependency, with no external Hive or Hadoop installation, so setup is quick; the README lists this among its key features.
Seamlessly works with JUnit 4 and 5, allowing developers to use existing testing workflows and CI/CD tools.
Provides HiveShell for programmatic test data insertion and annotations like @HiveResource, supporting formats like ORC and sequence files.
Supports multiple Hive execution engines (MapReduce, Tez) and command shell emulations (CLI, Beeline), offering versatile testing environments.
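As a rough illustration of the Maven-only setup mentioned above, the dependency declaration might look like the sketch below. The coordinates are an assumption: recent releases appear to be published under `io.github.hiverunner`, while older ones used a different group ID. The `hiverunner.version` property is a hypothetical placeholder, so check the project's README for the current coordinates and version.

```xml
<!-- Sketch only: the groupId is an assumption and hiverunner.version is a
     hypothetical property you would define in your own POM. -->
<dependency>
    <groupId>io.github.hiverunner</groupId>
    <artifactId>hiverunner</artifactId>
    <version>${hiverunner.version}</version>
    <!-- Test scope keeps the framework out of production artifacts. -->
    <scope>test</scope>
</dependency>
```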
Deliberately disallows Hive's 'add jar' statement, forcing teams to separate environment-specific code, which can hinder workflows relying on integrated UDFs.
Due to Hadoop's limited Windows support, running on Windows often requires Cygwin or workarounds, as noted in the limitations section.
Spins up and tears down a HiveServer for each test method, leading to slower execution times for large test suites.
Requires non-trivial Surefire plugin setup to avoid OOM errors and manage timeouts, adding initial setup complexity beyond a simple dependency.
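The Surefire setup mentioned above could look roughly like the following sketch. `forkCount`, `reuseForks`, `argLine`, and `systemPropertyVariables` are standard Surefire parameters. The heap size and the timeout property names and values (`enableTimeout`, `timeoutSeconds`, `timeoutRetries`) are assumptions based on how HiveRunner's documentation describes its settings, and should be verified against the version in use.

```xml
<!-- Sketch only: tune the values for your own suite and pin a plugin version. -->
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <configuration>
        <!-- Run tests in a single forked JVM that is not reused between
             test classes, so Hive state does not leak across them. -->
        <forkCount>1</forkCount>
        <reuseForks>false</reuseForks>
        <!-- Larger heap to reduce the risk of OOM errors (illustrative value). -->
        <argLine>-Xmx2048m</argLine>
        <systemPropertyVariables>
            <!-- Assumed HiveRunner timeout settings; confirm the names in its docs. -->
            <enableTimeout>true</enableTimeout>
            <timeoutSeconds>30</timeoutSeconds>
            <timeoutRetries>2</timeoutRetries>
        </systemPropertyVariables>
    </configuration>
</plugin>
```

Not reusing forks trades some execution speed for isolation, which adds to the per-test server startup cost noted above.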