A Go library that generates type-safe Parquet readers and writers from Go structs or existing Parquet files.
Parquet is a Go library that generates type-safe Parquet file readers and writers from Go struct definitions. It solves the problem of manual Parquet serialization/deserialization by automatically creating efficient, correct code for reading and writing Parquet files based on your data structures.
Go developers who need to work with Parquet files for data processing, analytics pipelines, or big data applications and want type-safe interfaces without manual implementation.
Developers choose Parquet because it provides automatic, type-safe code generation that eliminates boilerplate and reduces errors compared to manual Parquet implementations, while maintaining compatibility with standard Parquet formats.
A library for reading and writing parquet files.
Generates ParquetWriter and ParquetReader implementations directly from Go structs, eliminating manual serialization and reducing boilerplate code for Parquet file operations.
Leverages Go's type system in the generated readers and writers, so many field-type mistakes are caught at compile time rather than surfacing as runtime errors.
Can generate Go structs and corresponding code from existing Parquet files, aiding in working with legacy data formats, though with noted limitations on file compatibility.
Supports optional fields via pointers, embedded structs, nested/repeated structures, and field exclusion through tags or unexported fields, offering versatile data modeling.
Only supports DATA_PAGE page types and a narrow set of options, such as PLAIN encoding and SNAPPY compression. Many common Parquet encodings and options are excluded, as the README acknowledges, which makes it unsuitable for complex files.
Requires rerunning go generate after each schema change, which can slow iterative development, and is not suitable for dynamic or runtime-defined data structures.
Lacks support for more complex Parquet features, such as compression codecs other than Snappy and additional encodings, limiting its use in high-performance or diverse data pipelines.