A Python library for defining portable, modular, and testable data transformation DAGs with built-in lineage and metadata.
Apache Hamilton is a Python library that helps data scientists and engineers define, manage, and execute data transformation workflows as directed acyclic graphs (DAGs). It structures code into modular, testable functions that automatically build a DAG, solving problems of code maintainability, collaboration, and portability across environments. The library includes built-in features for lineage tracking, metadata capture, and data validation.
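To make the "functions automatically build a DAG" idea concrete, here is a minimal sketch in plain Python. It is not Hamilton's implementation or API; it only illustrates the convention Hamilton uses, where a function's name is the node it produces and its parameter names are the nodes it depends on. The helpers `build_graph` and `execute` and the example functions are hypothetical names invented for this illustration.

```python
import inspect


def build_graph(functions):
    """Map each function name to the names of the nodes it depends on."""
    return {fn.__name__: list(inspect.signature(fn).parameters) for fn in functions}


def execute(functions, target, inputs):
    """Compute `target` by recursively resolving its dependencies.

    Assumes the functions form an acyclic graph; this sketch does no cycle detection.
    """
    by_name = {fn.__name__: fn for fn in functions}
    cache = dict(inputs)  # externally supplied values act as leaf nodes

    def resolve(name):
        if name not in cache:
            if name not in by_name:
                raise KeyError(f"no function or input named {name!r}")
            fn = by_name[name]
            args = {dep: resolve(dep) for dep in inspect.signature(fn).parameters}
            cache[name] = fn(**args)  # each node is computed at most once
        return cache[name]

    return resolve(target)


# Hypothetical transformation functions: each one is a node in the graph.
def spend_per_signup(spend: float, signups: int) -> float:
    return spend / signups


def spend_per_signup_rounded(spend_per_signup: float) -> float:
    return round(spend_per_signup, 2)


FUNCTIONS = [spend_per_signup, spend_per_signup_rounded]
graph = build_graph(FUNCTIONS)
result = execute(
    FUNCTIONS, "spend_per_signup_rounded", {"spend": 100.0, "signups": 3}
)
```

Because every node is an ordinary function, it can be unit tested by calling it directly with plain values, without building or running the graph at all.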
Data scientists, data engineers, and ML engineers building ETL pipelines, ML workflows, LLM applications, or BI dashboards in Python who need scalable, maintainable data transformation code.
Developers choose Apache Hamilton for its unique combination of portability (runs anywhere Python does), modularity (encourages clean, testable code), and built-in observability (lineage and metadata tracking). Its function modifiers and separation of definition from execution reduce redundancy and ease the transition from development to production.
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. It runs and scales everywhere Python does.
DAGs run anywhere Python does, from standalone scripts to Airflow tasks and FastAPI web services.
DAGs are defined as regular Python functions, encouraging separation of concerns and unit testing, which is core to the library's philosophy.
Automatic lineage and metadata capture are included, with an optional UI for visualization and monitoring, enhancing debugging and collaboration.
Decorators such as @config.when() and @check_output keep code DRY: the first selects a function implementation based on configuration, and the second attaches validation to a function's output.
As an Apache incubating project, Hamilton may still undergo breaking changes and is not yet fully endorsed by the ASF, which is a risk for production use.
Full DAG visualization requires a separate Graphviz installation plus extra Python dependencies, which complicates initial setup.
Compared to established tools like Airflow or dbt, Hamilton has a smaller community and plugin ecosystem, potentially limiting integration options.
Hamilton is an open-source alternative to the following products: