A practical guide for researchers on how to properly structure and share data with statisticians to ensure efficient analysis.
A grammar of data manipulation for R, providing a consistent set of verbs to solve common data manipulation challenges.
A curated collection of Python libraries, tutorials, and tools for data science, from data wrangling to machine learning and visualization.
A blazing-fast command-line toolkit for querying, slicing, analyzing, transforming, and validating tabular data (CSV, Excel, JSONL, etc.).
A Go library providing DataFrames, Series, and data wrangling operations for tabular data manipulation.
A Go library providing DataFrames, Series, and data wrangling operations for structured data manipulation.
A command-line tool that provides jq-style access to structured data sources like SQL databases, CSV, and Excel files.
A flexible and fast package for in-memory tabular data manipulation and analysis in the Julia programming language.
A Python library for agile data preparation workflows that works with Pandas, Dask, cuDF, Dask-cuDF, Vaex, and PySpark.
A Python library providing clean, chainable functions for data cleaning and manipulation with pandas DataFrames.
An R package for reshaping and tidying data into a consistent format for easier analysis.
A sharp cut(1) clone with regex delimiters, column reordering, and automatic decompression for data exploration.
An R package for joining data frames on inexact matches, using string distance, regex, numeric tolerance, and other fuzzy criteria.
A VS Code extension for visually exploring, cleaning, and transforming tabular data with automatic Pandas code generation.
A high-performance data profiler for discovering and validating complex patterns like functional dependencies, inclusion dependencies, and association rules.
A high-performance data profiler for discovering and validating complex patterns in datasets, enabling data cleaning and quality analysis.
An idiomatic Clojure dataframe library that runs on Apache Spark, providing a seamless interface for data processing and machine learning.
A command-line tool for querying and transforming JSON/NDJSON documents using the GROQ query language.