A Clojure dataset manipulation library providing a dplyr-like API on top of tech.ml.dataset.
Tablecloth is a Clojure library that provides a high-level, dplyr-inspired API for dataset (data frame) manipulation on top of the tech.ml.dataset library. It simplifies common data transformation tasks like grouping, aggregation, and ordering by offering a unified, thread-first friendly interface.
Tablecloth is aimed at Clojure developers and data scientists working with tabular data who want a more intuitive, R-like syntax for dataset manipulation without sacrificing the performance of tech.ml.dataset.
Developers choose Tablecloth for its clean, focused API that reduces boilerplate, its seamless integration with grouped operations, and its design that prioritizes ease of use while building on a fast, columnar data backend.
Dataset manipulation library built on top of tech.ml.dataset.
Provides a single entry point for common operations like grouping and aggregation, reducing boilerplate by dispatching on arguments, as highlighted in the README's design goals.
Automatically handles grouped datasets, allowing functions to process both regular and grouped data without manual intervention, simplifying complex aggregations.
Functions are crafted to work seamlessly with Clojure's thread-first macro (->), enabling readable and sequential data transformation pipelines, as demonstrated in the usage example.
Draws inspiration from R's dplyr and tidyr, offering a familiar syntax for data scientists transitioning to Clojure, which lowers the learning curve for common tasks.
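The grouping, aggregation, and thread-first points above can be illustrated with a minimal sketch. It assumes Tablecloth is on the classpath and uses the `tablecloth.api` namespace; the `sales` data, the `summary` name, and the column names `:region`, `:amount`, and `:total` are hypothetical and chosen only for illustration.

```clojure
(require '[tablecloth.api :as tc])

;; Hypothetical sample data, for illustration only.
(def sales
  (tc/dataset {:region ["north" "south" "north" "south"]
               :amount [10 20 30 40]}))

;; A thread-first pipeline: group, aggregate per group, then order.
(def summary
  (-> sales
      ;; Group rows by the :region column.
      (tc/group-by [:region])
      ;; The aggregator receives each group's dataset; (% :amount)
      ;; looks up the :amount column, which is summed with reduce.
      (tc/aggregate {:total #(reduce + (% :amount))})
      ;; Sort the aggregated rows by :total, largest first.
      (tc/order-by :total :desc)))
```

Here `tc/aggregate` is applied directly to the grouped dataset and returns a regular dataset, so the `tc/order-by` step needs no explicit ungrouping in between.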
Heavily reliant on tech.ml.dataset, so any breaking changes or issues in the underlying library directly impact Tablecloth, necessitating version-specific branches and careful updates.
Focuses solely on dataset manipulation. Machine learning and other data science workflows are out of scope, as the README notes, so full pipelines require integration with additional libraries.
Contributing involves multiple code generation steps and specific tool requirements like Quarto CLI, as outlined in the README, which can hinder community involvement and quick fixes.