A Java dataframe and visualization library for data loading, cleaning, transformation, and analysis.
Tablesaw is a Java dataframe and visualization library that provides tools for loading, cleaning, transforming, filtering, and summarizing data. It enables data manipulation and analysis within the JVM, similar to pandas in Python, and includes visualization capabilities via Plot.ly integration. It helps Java developers perform data science tasks without leaving their preferred ecosystem.
Java developers and data scientists working within the JVM ecosystem who need to perform data manipulation, analysis, and visualization directly in Java applications or Jupyter notebooks.
Tablesaw offers a native Java solution for dataframe operations and visualization, eliminating the need to switch to Python or R for data tasks. Its integration with Jupyter notebooks and support for various data sources and machine learning libraries make it a versatile tool for data workflows in Java.
Tablesaw can load data from diverse sources including RDBMS, Excel, CSV, JSON, HTML, and remote locations like HTTP and S3, as detailed in the README's data import/export features.
Tablesaw works with BeakerX and IJava for interactive data exploration in Jupyter notebooks, giving data scientists a familiar notebook workflow in a Java environment.
The library provides a wrapper for Plot.ly, supporting various plot types like scatter plots, histograms, and heatmaps, with examples shown in the README's visualization section.
Tablesaw includes descriptive statistics such as mean, median, standard deviation, skewness, and kurtosis, making it suitable for exploratory data analysis directly in Java.
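For readers unfamiliar with these measures, the following is a minimal plain-Java sketch of what each statistic computes. It is not Tablesaw's implementation and does not use the Tablesaw API. The class name DescriptiveStats and its method names are hypothetical, and the formulas are the population (uncorrected) versions, so a library that applies sample corrections may report slightly different values.

```java
import java.util.Arrays;

// Illustrative only: a hypothetical helper class, not part of Tablesaw.
// All formulas are the population (uncorrected) versions.
final class DescriptiveStats {

    // Arithmetic mean: sum of values divided by their count.
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Median: middle value of the sorted data, or the average of the
    // two middle values when the count is even.
    static double median(double[] xs) {
        double[] sorted = xs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    // k-th central moment: average of (x - mean)^k.
    static double centralMoment(double[] xs, int k) {
        double m = mean(xs);
        double sum = 0.0;
        for (double x : xs) {
            sum += Math.pow(x - m, k);
        }
        return sum / xs.length;
    }

    // Population standard deviation: square root of the second central moment.
    static double populationStdDev(double[] xs) {
        return Math.sqrt(centralMoment(xs, 2));
    }

    // Skewness: third central moment scaled by the cubed standard deviation.
    static double skewness(double[] xs) {
        return centralMoment(xs, 3) / Math.pow(centralMoment(xs, 2), 1.5);
    }

    // Excess kurtosis: fourth central moment over squared variance, minus 3
    // so that a normal distribution scores 0.
    static double excessKurtosis(double[] xs) {
        double variance = centralMoment(xs, 2);
        return centralMoment(xs, 4) / (variance * variance) - 3.0;
    }

    public static void main(String[] args) {
        double[] sample = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("mean     = " + mean(sample));
        System.out.println("median   = " + median(sample));
        System.out.println("std dev  = " + populationStdDev(sample));
        System.out.println("skewness = " + skewness(sample));
        System.out.println("kurtosis = " + excessKurtosis(sample));
    }
}
```

In exploratory analysis, positive skewness indicates a longer right tail, and negative excess kurtosis indicates lighter tails than a normal distribution.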
Features like Excel support, JSON handling, and visualization require separate dependencies (e.g., tablesaw-excel, tablesaw-jsplot), complicating setup and increasing project size.
Visualization relies on the Plot.ly JavaScript library, which may not work well in headless or non-browser environments and adds an external dependency outside the Java ecosystem.
Compared to Python's pandas, Tablesaw has fewer third-party extensions and integrations, as evidenced by Parquet support being an external project not maintained by the core team.