Open-Awesome

geoparquet

Apache-2.0 · Python · v1.1.0+p1

An open specification for storing geospatial vector data (points, lines, polygons) in the Apache Parquet columnar storage format.

1.0k stars · 67 forks · 0 contributors

What is geoparquet?

GeoParquet is a community-driven specification that defines how to store geospatial vector data (points, lines, polygons) within Apache Parquet files. It standardizes geospatial data representation in Parquet to enhance interoperability across tools and advance cloud-native geospatial workflows. The specification provides a stable foundation for efficient geospatial analytics within modern, columnar data ecosystems.
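As a rough illustration of what the specification standardizes, the sketch below encodes two points as WKB and builds the JSON document that GeoParquet places under the "geo" key of the Parquet file metadata. It uses only the standard library and does not write a Parquet file. The helper names (point_to_wkb, build_geo_metadata) and the sample rows are assumptions made for this example, not part of any library.

```python
import json
import struct


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB: byte order, geometry type, x, y."""
    # First 1 = little-endian marker, second 1 = WKB geometry type code for Point.
    return struct.pack("<BIdd", 1, 1, x, y)


def build_geo_metadata(column: str, bbox: list) -> str:
    """Build the JSON string stored under the "geo" key of the file metadata."""
    meta = {
        "version": "1.1.0",
        "primary_column": column,
        "columns": {
            column: {
                "encoding": "WKB",
                "geometry_types": ["Point"],
                "bbox": bbox,  # optional: [xmin, ymin, xmax, ymax]
            }
        },
    }
    return json.dumps(meta)


# Hypothetical sample data: (name, lon, lat).
rows = [("Lisbon", -9.14, 38.72), ("Oslo", 10.75, 59.91)]

# One WKB value per row; in a real file this would be a binary Parquet column.
geometry_column = [point_to_wkb(lon, lat) for _, lon, lat in rows]

lons = [lon for _, lon, _ in rows]
lats = [lat for _, _, lat in rows]
geo_json = build_geo_metadata("geometry", [min(lons), min(lats), max(lons), max(lats)])
```

In practice a library that implements the specification would produce both the WKB column and the metadata; the point here is only that the geometry is an ordinary binary column plus a small, well-defined JSON description.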

Target Audience

Data engineers, data scientists, and geospatial analysts working with cloud data warehouses (like BigQuery, Snowflake, Redshift) or columnar data processing frameworks who need to store and analyze geospatial vector data efficiently. It is also for developers building geospatial tools and libraries that require interoperable geospatial data storage.

Value Proposition

Developers choose GeoParquet because it brings geospatial best practices to the widely adopted Parquet format, enabling high-performance, read-heavy analytic workflows with efficient compression and columnar storage. Its main differentiator is a standardized, interoperable specification supported by over 20 tools across 6 languages, which makes it a common base for cloud-native and streaming vector workflows.

Overview

Specification for storing geospatial vector data (point, line, polygon) in Parquet

Use Cases

Best For

  • Storing geospatial vector data for read-heavy analytic scenarios in cloud data warehouses.
  • Enabling interoperability among different geospatial tools and libraries using Apache Parquet.
  • Persisting geospatial data from Apache Arrow for cross-language in-memory analytics.
  • Optimizing geospatial workflows with efficient compression and columnar storage to reduce disk space and network transfer costs.
  • Partitioning large geospatial datasets across multiple files for improved processing efficiency.
  • Supporting both planar and spherical coordinate systems in cloud-native geospatial applications.

Not Ideal For

  • Systems requiring frequent, small writes or real-time data updates (e.g., live GPS tracking apps).
  • Tools focused solely on geospatial visualization without analytical processing needs.
  • Legacy GIS workflows heavily dependent on established formats like Shapefiles for compatibility.

Pros & Cons

Pros

Efficient Compression

Leverages Parquet's columnar design to achieve high compression ratios, significantly reducing disk space and network transfer costs as highlighted in the README.

Broad Tool Interoperability

Standardizes geospatial data exchange with support from over 20 tools across 6 languages, enhancing compatibility across cloud data warehouses and libraries.

Optimized for Analytic Workloads

Enables cheap column reads and efficient filtering via statistics, making it ideal for read-heavy scenarios in modern data ecosystems, per the README's feature list.
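Filtering via statistics can be pictured as follows: Parquet keeps minimum and maximum values per row group, so a reader can skip any row group whose bounds cannot intersect the query area. The sketch below models that idea in plain Python. RowGroupStats and row_groups_to_read are hypothetical names invented for this example, and the numbers are made up; real readers do this inside the Parquet library.

```python
from dataclasses import dataclass


@dataclass
class RowGroupStats:
    """Hypothetical min/max statistics of bounding-box columns for one row group."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def row_groups_to_read(stats, query):
    """Return the indexes of row groups whose bounds intersect the query box."""
    qxmin, qymin, qxmax, qymax = query
    keep = []
    for i, s in enumerate(stats):
        # Two boxes are disjoint if one lies entirely to one side of the other.
        disjoint = (
            s.xmax < qxmin or s.xmin > qxmax or s.ymax < qymin or s.ymin > qymax
        )
        if not disjoint:
            keep.append(i)
    return keep


# Made-up statistics for three row groups.
stats = [
    RowGroupStats(-10.0, 35.0, 0.0, 45.0),
    RowGroupStats(0.0, 45.0, 15.0, 60.0),
    RowGroupStats(100.0, -10.0, 120.0, 5.0),
]

# Query box as (xmin, ymin, xmax, ymax); only the second row group overlaps it.
selected = row_groups_to_read(stats, (5.0, 50.0, 12.0, 58.0))
```

The pruning is only as good as the spatial ordering of the data: if features are written in random order, every row group tends to span the whole extent and nothing can be skipped.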

Flexible Coordinate Support

Supports both planar and spherical coordinate systems, aligning with major cloud platforms like BigQuery and Snowflake for seamless integration.
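The difference between the two interpretations shows up in how the edge between two vertices is drawn. In GeoParquet metadata this is recorded per geometry column in the "edges" field, with the values "planar" or "spherical". The sketch below, a simplified illustration that assumes a perfect sphere, compares the midpoint of one edge under each interpretation. The function names are assumptions made for this example.

```python
import math


def planar_midpoint(a, b):
    """Midpoint of a straight line in (lon, lat) coordinate space."""
    return (a[0] + b[0]) / 2, (a[1] + b[1]) / 2


def spherical_midpoint(a, b):
    """Midpoint along the great-circle arc between two (lon, lat) points.

    Assumes a perfect sphere and that the points are not antipodal.
    """
    def to_xyz(lon, lat):
        lon_r, lat_r = math.radians(lon), math.radians(lat)
        return (
            math.cos(lat_r) * math.cos(lon_r),
            math.cos(lat_r) * math.sin(lon_r),
            math.sin(lat_r),
        )

    ax, ay, az = to_xyz(*a)
    bx, by, bz = to_xyz(*b)
    # The normalized sum of two unit vectors points at the arc midpoint.
    mx, my, mz = ax + bx, ay + by, az + bz
    norm = math.sqrt(mx * mx + my * my + mz * mz)
    lat = math.degrees(math.asin(mz / norm))
    lon = math.degrees(math.atan2(my, mx))
    return lon, lat


# An edge between two points on the 45th parallel, 90 degrees of longitude apart.
start, end = (0.0, 45.0), (90.0, 45.0)
planar = planar_midpoint(start, end)
spherical = spherical_midpoint(start, end)
```

The planar midpoint stays on the 45th parallel, while the great-circle midpoint bulges several degrees toward the pole, which is why a reader needs to know which interpretation the writer intended.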

Cons

Write Performance Trade-off

The specification explicitly notes that row-based formats are better for constant data updates, making GeoParquet unsuitable for write-heavy systems like transactional databases.

Implementation Variability

Because GeoParquet is a community-driven spec, actual features and performance depend on individual tool implementations, which can lead to inconsistencies or gaps in support.

Geospatial Complexity Overhead

Requires teams to integrate geospatial concepts with Parquet's columnar storage, adding a learning curve for those new to either domain.

Quick Stats

Stars: 1,045
Forks: 67
Contributors: 0
Open Issues: 38
Last commit: 14 days ago
Created: Since 2021

Tags

#geospatial #gis #columnar-storage #interoperability #geoparquet #data-specification #big-data #apache-parquet #data-format #vector-data #cloud-native

Included in

Frontend GIS · 675
Auto-fetched 1 day ago

Related Projects

Turf.js

A modular geospatial engine written in JavaScript and TypeScript

Stars: 10,380
Forks: 1,006
Last commit: 7 days ago

topoJSON

An extension of GeoJSON that encodes topology! 🌐

Stars: 4,901
Forks: 684
Last commit: 1 year ago

geolib

Zero dependency library to provide some basic geo functions

Stars: 4,274
Forks: 338
Last commit: 2 months ago

rbush

RBush, a high-performance JavaScript R-tree-based 2D spatial index for points and rectangles

Stars: 2,750
Forks: 253
Last commit: 5 days ago