An RFC 4180-compliant, composable CSV parsing and encoding library for Elixir.
A library for parsing and querying XML data with Apache Spark SQL and DataFrames.
A collection of utility methods for Java 8 Streams, providing missing operations like takeWhile, zip, and unfold.
Kotlin bindings and extensions for Apache Spark, enabling idiomatic Kotlin development with data classes, lambdas, and null safety.
A command-line toolkit for efficient querying and manipulation of NCBI Taxonomy data, with support for custom taxonomies.
A high-performance JSON parser for Java 8 focused on speed and minimal memory footprint.
A high-performance Clojure library for JSON encoding and decoding, built on Jackson.
A visualization framework for Apache Pig workflows that combines graphical depictions with real-time execution information.
A dependency-free cross-platform Swiss Army knife for manipulating and editing Protein Data Bank (PDB) files.
A parallel gzip decompressor with fast random access, utilizing multi-core CPUs for high-speed decompression of standard gzip files.
A complete pipeline for processing two-photon calcium imaging data, including registration, ROI detection, signal extraction, and spike deconvolution.
An open-source toolkit for scalable, standardized computational pathology analysis, enabling AI and machine learning on large imaging datasets.
A library for writing Apache Spark applications in Haskell, enabling resilient analytics that scale to thousands of nodes.
A simple GUI application for editing ROS bag files by filtering topics, adjusting time ranges, and modifying transformations.
A parser and formatter for delimiter-separated values like CSV and TSV, based on RFC 4180.
A fast, fully-featured, and developer-friendly Clojure API for Apache Spark.
A high-performance, extensible CSV parsing library for .NET with support for synchronous and asynchronous reading.
A high-performance Go library for JSON unmarshalling that handles both known and unknown fields without data loss.
A high-performance SIMD CSV parser library and extensible CLI utility for tabular data processing.
A framework enabling spatial data analysis within Hadoop ecosystems using Hive and SparkSQL.
A fast, cross-platform, multi-threaded compression and decompression CLI tool written in Rust.
An open-source Python and CLI tool for reading OpenStreetMap PBF files using DuckDB and exporting to GeoParquet.
A framework for orchestrating forensic collection, processing, and data export through modular recipes.
A lazy functional iteration library supporting sync, async, and concurrent iteration in JavaScript.
A Node.js library to automatically scrape and extract readable article content from any web page, supporting both English and Chinese.
Real-time visualization and processing tool for live 3D LiDAR data from Velodyne sensors.
A fast, promise-based CSV parser for Node.js that wraps csv-parser for convenient usage.
A serverless proxy for Spark clusters that provides a functional programming framework and deployment model for Spark applications.
A zero-allocation, high-performance parser for fixed-length and variable-length files in .NET, using expression trees and Span.
A bi-directional connector enabling Apache Spark to read from and write to Neo4j graph databases using Spark DataSource APIs.
A Rust library for reading and writing ZIP files with support for multiple compression formats and encryption.
ASP.NET bindings and automatic parsing for jQuery DataTables with extension methods for data queries.
Real-time reception, recording, visualization, and processing of 3D LiDAR data from multiple manufacturers.
A lightweight real-time big data streaming engine built on Akka for high-throughput, low-latency data processing.
An idiomatic Clojure dataframe library that runs on Apache Spark, providing a seamless interface for data processing and machine learning.
A fast, standards-compliant SAX parser and encoder for XML in Elixir, supporting streaming and simple DOM export.