An R package for flexibly rearranging, reshaping, and aggregating data, now superseded by tidyr.
reshape2 is an R package that provides tools for flexibly rearranging, reshaping, and aggregating data, primarily converting data between wide and long formats. It solves the problem of restructuring datasets to facilitate analysis, visualization, and modeling in R. The package is a reboot of the original reshape, offering significant performance improvements through optimized algorithms.
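As a minimal sketch of the wide-to-long and long-to-wide conversion described above, the example below melts a small data frame and casts it back. The data frame and its column names (id, height, weight) are hypothetical and chosen only for illustration.

```r
library(reshape2)

# Hypothetical wide-format data: one row per subject, one column per measurement.
wide <- data.frame(
  id     = c(1, 2, 3),
  height = c(170, 165, 180),
  weight = c(65, 58, 82)
)

# Wide -> long: one row per (id, measured variable) pair.
long <- melt(wide, id.vars = "id")

# Long -> wide: one row per id, one column per level of `variable`.
wide_again <- dcast(long, id ~ variable, value.var = "value")
```

Here long has the columns id, variable, and value, and wide_again has the same columns as the original wide data frame.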
reshape2 suits data scientists, statisticians, and R programmers who need to manipulate and reshape datasets for analysis, especially those working with tidy data principles or preparing data for visualization.
Developers choose reshape2 for its speed and efficiency, with cast operations up to 100x faster and melt operations up to 10x faster than the original reshape, and for its streamlined syntax that simplifies complex data transformations. However, it is now superseded by tidyr, which is recommended for modern workflows.
An R package to flexibly rearrange, reshape, and aggregate data
Melt operations are up to 10x faster and cast operations up to 100x faster than the original reshape, due to efficient subsetting algorithms that minimize data copying.
Replaces the generic cast with two functions chosen by output type: dcast returns a data frame, and acast returns a vector, matrix, or array.
Improves usability with better default column names when melting arrays (Var1, Var2, etc.) and clearer melt argument names such as variable.name and value.name, making data melting more intuitive.
Supports multidimensional margins for summary computations: each margin is named after the variable whose value is set to (all), replacing the older grand_row and grand_col arguments.
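The points above can be illustrated with a short sketch. The scores data frame and its column names are hypothetical. The example assumes a current reshape2 release and shows melt's naming arguments, the dcast and acast output types, and a cast with margins.

```r
library(reshape2)

# Melting a matrix without dimnames gives index columns named Var1 and Var2.
m <- matrix(1:6, nrow = 2)
melted_matrix <- melt(m)

# Hypothetical data frame used for the remaining steps.
scores <- data.frame(
  student = c("a", "b"),
  math    = c(90, 70),
  art     = c(80, 60)
)

# variable.name and value.name control the names of the melted columns.
long <- melt(scores, id.vars = "student",
             variable.name = "subject", value.name = "score")

# dcast returns a data frame; acast returns a matrix for a two-sided formula.
df_out  <- dcast(long, student ~ subject, value.var = "score")
mat_out <- acast(long, student ~ subject, value.var = "score")

# margins = TRUE adds a row and a column labelled "(all)" computed with
# the aggregation function (here, sum).
with_margins <- dcast(long, student ~ subject, value.var = "score",
                      fun.aggregate = sum, margins = TRUE)
```

Passing a character vector of variable names to margins, instead of TRUE, limits the margins to those variables.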
Officially marked as superseded by tidyr, meaning it receives only maintenance updates and is not recommended for new projects, limiting future feature development.
Removes key features from the original reshape, such as the | cast operator and the ability to return multiple values from aggregation functions, which may hinder complex workflows.
As a legacy package, it lacks seamless integration with newer tidyverse tools like dplyr and tidyr, making it less ideal for contemporary data science pipelines.
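For readers moving to tidyr, the following is a rough sketch of how a melt and dcast round trip might be written with pivot_longer and pivot_wider. It assumes tidyr 1.0.0 or later, and the data frame and column names are hypothetical. It is not a complete mapping of reshape2 features.

```r
library(tidyr)

# Hypothetical wide-format data, as might previously be passed to melt().
wide <- data.frame(
  id     = c(1, 2, 3),
  height = c(170, 165, 180),
  weight = c(65, 58, 82)
)

# Rough counterpart of melt(wide, id.vars = "id").
long <- pivot_longer(wide, cols = c(height, weight),
                     names_to = "variable", values_to = "value")

# Rough counterpart of dcast(long, id ~ variable).
wide_again <- pivot_wider(long, names_from = variable, values_from = value)
```

Both tidyr functions return tibbles rather than plain data frames, so downstream code that relies on data frame specifics may need small adjustments.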