A distributed framework extending Apache Spark with unified SQL access to multiple datastores, optimized connectors, and streaming support.
Crossdata is a distributed framework that extends Apache Spark's capabilities with a unified SQL-like interface for accessing multiple datastore technologies. It solves the problem of fragmented data access by providing optimized connectors, streaming support, and enhanced SQL features across diverse data sources like Cassandra, MongoDB, and ElasticSearch.
Crossdata is aimed at data engineers and analysts who need to query multiple datastores through a single interface, particularly those who work in Apache Spark ecosystems and need to integrate with BI tools via JDBC/ODBC.
Developers choose Crossdata because it works around Spark's limitations by accessing datasources natively, offers unified SQL across technologies, and provides self-contained JDBC/ODBC support with no Hive dependency, which makes big data analytics more efficient and accessible.
DISCONTINUED - Easy access to big things. Library for Apache Spark extending and improving its capabilities
Provides a single SQL-like language for querying multiple datastores such as Cassandra, MongoDB, and ElasticSearch, which simplifies analytics across diverse technologies, as the feature list highlights.
Accesses datasources natively to speed up queries by avoiding Spark cluster overhead; the connectors for Cassandra, MongoDB, and ElasticSearch reduce the cluster resources that a query blocks.
Supports batch and streaming processing through the same SQL interface, so data from different input technologies can be mixed, as the introduction states.
Offers JDBC/ODBC connectivity for BI tools without requiring Hive, which makes integration easier than with solutions that depend on Hive, as noted in the advantages.
The project has been discontinued and moved to a commercial license, meaning no further open-source updates or community-driven development, as stated clearly at the top of the README.
Supports Apache Spark only up to 1.6.X, which is outdated and incompatible with newer Spark releases; the compatibility table lists no version beyond Spark 1.6.
Involves multiple components like Core, Server, Driver, and Connectors, which can increase deployment and maintenance complexity, as described in the components breakdown.
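The native-access point above can be illustrated with a minimal, purely conceptual sketch of the routing decision: a query that touches a single datastore is sent straight to it, while a query that mixes datastores goes to Spark. This is not Crossdata's API. The catalog, the `Plan` type, and `plan_query` are hypothetical names invented for illustration, and a real planner would also have to check that the datastore can execute every operation in the query.

```python
from dataclasses import dataclass

# Hypothetical catalog (an assumption for this sketch):
# table name -> datastore that holds it.
CATALOG = {
    "users": "cassandra",
    "orders": "mongodb",
    "logs": "elasticsearch",
}


@dataclass(frozen=True)
class Plan:
    engine: str        # "native" or "spark"
    datastores: tuple  # sorted names of the datastores the query reads


def plan_query(tables):
    """Pick an execution path for a query that reads the given tables."""
    if not tables:
        raise ValueError("a query must read at least one table")
    # Collect the distinct datastores behind the referenced tables.
    stores = tuple(sorted({CATALOG[t] for t in tables}))
    # One datastore: run the query there and skip the cluster.
    # Several datastores: the data has to be combined, so use Spark.
    engine = "native" if len(stores) == 1 else "spark"
    return Plan(engine=engine, datastores=stores)


if __name__ == "__main__":
    print(plan_query(["users"]).engine)            # native
    print(plan_query(["users", "orders"]).engine)  # spark
```

Keeping the decision in one small function makes the trade-off explicit: the native path avoids cluster overhead, and the Spark path is reserved for queries that actually need to combine sources.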