Open-Awesome

© 2026 Open-Awesome. Curated for the developer elite.


spatial-framework-for-hadoop

Apache-2.0 · Java · v2.2.0

A framework enabling spatial data analysis within Hadoop ecosystems using Hive and SparkSQL.

376 stars · 158 forks · 0 contributors

What is spatial-framework-for-hadoop?

Spatial Framework for Hadoop is an open-source library that enables spatial data analysis within Hadoop ecosystems. It provides User-Defined Functions and serialization tools for Hive and SparkSQL, allowing users to process and query geographic data at scale. The framework solves the problem of integrating geospatial analytics into big data workflows without requiring specialized standalone systems.

Target Audience

Data engineers and data scientists working with large-scale spatial data in Hadoop environments, particularly those using Hive or SparkSQL for analytics. It's also relevant for organizations with existing Esri/ArcGIS infrastructure looking to extend spatial capabilities to big data platforms.

Value Proposition

Developers choose this framework because it provides native spatial functions within familiar Hadoop tools, avoiding the need for separate geospatial processing systems. Its tight integration with Esri's geometry standards ensures compatibility with ArcGIS workflows while leveraging Hadoop's distributed processing power.

Overview

The Spatial Framework for Hadoop allows developers and data scientists to use the Hadoop data processing system for spatial data analysis.

Use Cases

Best For

  • Performing spatial joins and aggregations on large geographic datasets in Hadoop
  • Integrating ArcGIS-generated JSON data into Hive or SparkSQL pipelines
  • Adding spatial analysis capabilities to existing Hive-based data warehouses
  • Processing satellite imagery or sensor data with geographic coordinates at scale
  • Building location-aware analytics applications on Hadoop clusters
  • Extending Esri/ArcGIS workflows to big data environments without vendor lock-in
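
As a rough illustration of the first use case above, a spatial join with aggregation amounts to testing each point against each polygon and counting the matches. The sketch below shows that logic in plain Java using ray casting. It is not the framework's API (in Hive the same idea would be expressed as a query over functions such as ST_Intersects), and the class, method, and zone names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: a brute-force point-in-polygon join
// followed by a count per polygon. Not part of the framework.
class SpatialJoinSketch {

    // Ray casting: a point is inside when a horizontal ray from it
    // crosses the ring an odd number of times.
    // ring[i] = {x, y}; the last vertex implicitly connects to the first.
    static boolean contains(double[][] ring, double px, double py) {
        boolean inside = false;
        for (int i = 0, j = ring.length - 1; i < ring.length; j = i++) {
            double xi = ring[i][0], yi = ring[i][1];
            double xj = ring[j][0], yj = ring[j][1];
            boolean straddles = (yi > py) != (yj > py);
            if (straddles && px < (xj - xi) * (py - yi) / (yj - yi) + xi) {
                inside = !inside;
            }
        }
        return inside;
    }

    // Join every point against every named polygon and count the matches.
    static Map<String, Integer> countPointsPerPolygon(
            Map<String, double[][]> polygons, List<double[]> points) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, double[][]> entry : polygons.entrySet()) {
            int matches = 0;
            for (double[] p : points) {
                if (contains(entry.getValue(), p[0], p[1])) {
                    matches++;
                }
            }
            counts.put(entry.getKey(), matches);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, double[][]> zones = new LinkedHashMap<>();
        zones.put("west", new double[][] {{0, 0}, {5, 0}, {5, 5}, {0, 5}});
        zones.put("east", new double[][] {{5, 0}, {10, 0}, {10, 5}, {5, 5}});
        List<double[]> points = Arrays.asList(
                new double[] {1, 1}, new double[] {2, 4},
                new double[] {7, 3}, new double[] {20, 20});
        System.out.println(countPointsPerPolygon(zones, points));
    }
}
```

The nested loop is quadratic in the number of points and polygons, which is the kind of work a distributed engine would partition across a cluster.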

Not Ideal For

  • Teams using non-Hadoop data platforms like cloud data warehouses (e.g., BigQuery, Snowflake) for spatial analysis
  • Projects not integrated with Esri/ArcGIS tools, as the JSON utilities are optimized for ArcGIS exports
  • Environments requiring seamless Maven Central dependency management without manual builds
  • Real-time geospatial processing applications, since Hadoop is batch-oriented and the framework focuses on Hive/SparkSQL queries

Pros & Cons

Pros

ArcGIS Ecosystem Integration

The JSON utilities specifically handle JSON exported from ArcGIS, making it easy to incorporate Esri data into Hadoop workflows without reformatting.

Spatial UDFs for Hive/Spark

Provides User-Defined Functions and SerDes for spatial analysis directly in Hive and SparkSQL, enabling native geospatial queries like ST_Intersects or ST_Buffer on large datasets.
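
In Hive, UDFs like these are typically made available with ADD JAR and CREATE TEMPORARY FUNCTION statements before they appear in a query. The sketch below only assembles those HiveQL strings in Java and does not connect to Hive. The jar path, the `com.esri.hadoop.hive` package prefix, and the table and column names are assumptions for illustration and should be checked against the project's own documentation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds HiveQL text only; it does not talk to Hive.
class HiveSpatialSetupSketch {

    // Assumed package for the UDF classes; verify against the build you use.
    static final String UDF_PACKAGE = "com.esri.hadoop.hive";

    // One registration statement per spatial function name.
    static String registerFunction(String name) {
        return "CREATE TEMPORARY FUNCTION " + name
                + " AS '" + UDF_PACKAGE + "." + name + "'";
    }

    // ADD JAR first, then one CREATE TEMPORARY FUNCTION per name.
    static List<String> setupStatements(String jarPath, String... functionNames) {
        List<String> statements = new ArrayList<>();
        statements.add("ADD JAR " + jarPath);
        for (String name : functionNames) {
            statements.add(registerFunction(name));
        }
        return statements;
    }

    public static void main(String[] args) {
        // The jar path is a placeholder, not a real artifact name.
        for (String s : setupStatements("/path/to/spatial-framework.jar",
                "ST_Intersects", "ST_Buffer")) {
            System.out.println(s + ";");
        }
        // Example query over hypothetical tables with geometry columns.
        System.out.println("SELECT a.id, b.id FROM parcels a JOIN flood_zones b"
                + " WHERE ST_Intersects(a.shape, ST_Buffer(b.shape, 100.0));");
    }
}
```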

Robust Geometry Library

Built on the Esri Geometry API for Java, ensuring accurate spatial operations and calculations that adhere to enterprise standards, as seen in the ST_Centroid fix in v2.1.

Broad Version Compatibility

Supports Hive v1+ and SparkSQL, with ongoing updates like Hive v4 compatibility, allowing use across various Hadoop distributions.

Cons

Limited Maven Availability

Pre-built releases may not be available on Maven Central, so users may need to build from source or manage the dependency manually, as noted in the README issue #123.

Legacy Build Tool Support

Ant build files are available but marked as legacy and likely to be removed, forcing users to migrate to Maven for future updates.

Complex Custom Deployment

Workflows requiring MapReduce jobs need custom job authoring and deployment, adding overhead compared to drop-in solutions.

Quick Stats

Stars: 376
Forks: 158
Contributors: 0
Open Issues: 22
Last commit: 2 months ago
Created: 2013

Tags

#geospatial #java #gis #hive #data-management #big-data #data-processing #spatial-analysis #hadoop #sparksql

Built With

  • Hadoop
  • Apache Ant
  • Maven
  • Hive
  • Java

Included in

ArcGIS Developer (314)
Auto-fetched 4 hours ago

Related Projects

Turf.js

A modular geospatial engine written in JavaScript and TypeScript

Stars: 10,360
Forks: 1,004
Last commit: 2 days ago

gis-tools-for-hadoop

The GIS Tools for Hadoop are a collection of GIS tools for spatial analysis of big data.

Stars: 522
Forks: 251
Last commit: 4 years ago