A command-line tool for validating, cleaning, and minimizing GTFS transit feed files while preserving semantic equivalence.
gtfstidy is a command-line tool designed to validate, sanitize, and minimize GTFS (General Transit Feed Specification) feeds. It fixes inconsistencies, removes redundant data, and optimizes file sizes while ensuring the output feed remains semantically equivalent to the original, meaning all trips and passenger information are preserved.
It is aimed at transit agency data managers, open data publishers, and developers working with GTFS feeds who need to ensure compliance, reduce storage and bandwidth costs, and prepare feeds for distribution or further processing.
It offers a suite of processors for GTFS optimization, from validation and error correction to minimization algorithms, while keeping the output semantically equivalent to the input, which is a key requirement for reliable transit applications.
A tool for checking, sanitizing and minimizing GTFS feeds.
Performs extensive checks including stop time progressions, ID references, and field value ranges, ensuring GTFS standard compliance as detailed in the validation features.
Uses algorithms like Douglas-Peucker for shape minimization and frequency covers for trips, reducing feed sizes by up to 30-40% as shown in the evaluation tables with real-world feeds.
Maintains exact passenger trip semantics while optimizing, ensuring output feeds are equivalent to input from a rider's perspective, which is a core philosophy stated in the README.
Offers options to set erroneous fields to defaults or drop non-fixable entries, providing flexibility in handling dirty data without breaking the entire feed.
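The shape minimization mentioned above is based on Douglas-Peucker line simplification. As a rough illustration of the idea only, here is a minimal sketch in Go. It is not gtfstidy's actual code: the `Point` type, the `Simplify` and `segDist` names, and the use of plain planar distances (rather than distances on geographic coordinates) are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"math"
)

// Point is a planar coordinate (hypothetical type for this sketch;
// real GTFS shapes use latitude/longitude).
type Point struct{ X, Y float64 }

// segDist returns the distance from p to the segment a-b.
func segDist(p, a, b Point) float64 {
	dx, dy := b.X-a.X, b.Y-a.Y
	if dx == 0 && dy == 0 {
		return math.Hypot(p.X-a.X, p.Y-a.Y)
	}
	// Project p onto the line through a and b, clamped to the segment.
	t := ((p.X-a.X)*dx + (p.Y-a.Y)*dy) / (dx*dx + dy*dy)
	t = math.Max(0, math.Min(1, t))
	return math.Hypot(p.X-(a.X+t*dx), p.Y-(a.Y+t*dy))
}

// Simplify applies Douglas-Peucker: it keeps the endpoints, finds the
// interior point farthest from the segment joining them, and recurses on
// both halves if that point is farther away than eps. Otherwise all
// interior points are dropped.
func Simplify(pts []Point, eps float64) []Point {
	if len(pts) < 3 {
		return append([]Point(nil), pts...)
	}
	first, last := pts[0], pts[len(pts)-1]
	maxDist, idx := 0.0, 0
	for i := 1; i < len(pts)-1; i++ {
		if d := segDist(pts[i], first, last); d > maxDist {
			maxDist, idx = d, i
		}
	}
	if maxDist <= eps {
		return []Point{first, last}
	}
	left := Simplify(pts[:idx+1], eps)
	right := Simplify(pts[idx:], eps)
	// The split point ends left and starts right; keep it only once.
	return append(left[:len(left)-1], right...)
}

func main() {
	shape := []Point{{0, 0}, {1, 0.05}, {2, -0.05}, {3, 0}, {4, 2}, {5, 0}}
	fmt.Println(Simplify(shape, 0.1))
}
```

With a tolerance of 0.1, the two nearly collinear points at the start of this sample shape are dropped, while the sharp detour through (4, 2) is preserved, so four of the six points remain. A larger tolerance removes more points at the cost of geometric accuracy.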
Requires understanding and enabling specific processors via command-line flags, with a fixed order of execution, which can be overwhelming for users unfamiliar with GTFS internals or optimization algorithms.
The ID minimization processors rewrite IDs, so their output is unsuitable for feeds whose IDs are referenced by GTFS-realtime or other external systems, as explicitly cautioned in the README's ID minimizer section.
Lacks a GUI or web interface, which limits accessibility for non-technical users or those who prefer visual tools, and setup relies on a Go installation.