A Python library that brings R's dplyr data manipulation syntax to pandas DataFrames using a pipe operator.
Dplython is a Python library that brings R's dplyr data manipulation syntax and philosophy to pandas DataFrames. It provides a set of intuitive, chainable functions for selecting, filtering, transforming, and summarizing data using a pipe operator (`>>`), making data analysis workflows more readable and expressive.
Dplython is aimed at data scientists, analysts, and Python developers who already know R's dplyr or who want a more expressive, functional approach to data manipulation in pandas.
Developers choose Dplython for its familiar dplyr-like syntax, clean pipeline operations, and seamless integration with pandas, offering a more intuitive alternative to pandas' native methods for complex data transformations.
dplyr for python
Provides familiar verbs such as `select`, `mutate`, and `group_by`, so R users can move to Python while still thinking in dplyr terms.
Uses the `>>` operator to chain data transformations, so a multi-step manipulation reads top to bottom as a single pipeline instead of a series of nested calls.
Columns are referenced as `X.column_name`, or as `X['column name']` when the name is not a valid Python identifier (for example, a name containing spaces), as the README illustrates.
The `@DelayFunction` decorator lets custom functions, including plotting calls to libraries such as ggplot and matplotlib, be used directly inside a pipeline.
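The three ideas above (a `>>` pipeline, deferred column references through `X`, and delayed custom functions) can be sketched in plain Python. This is an illustrative toy that works on a list of dictionaries, not Dplython's actual implementation: the `Expr`, `Columns`, `Table`, `resolve`, and `delay` names are hypothetical, and the verbs only borrow the names `sift`, `mutate`, and `select` for familiarity.

```python
class Expr:
    """A deferred computation: holds a function of one row, evaluated later."""

    def __init__(self, fn, name=None):
        self.fn = fn
        self.name = name  # set only for plain column references

    def __gt__(self, other):
        return Expr(lambda row: self.fn(row) > resolve(other, row))

    def __truediv__(self, other):
        return Expr(lambda row: self.fn(row) / resolve(other, row))


def resolve(value, row):
    """Evaluate a deferred expression against a row; pass constants through."""
    return value.fn(row) if isinstance(value, Expr) else value


class Columns:
    """Builds column references from attribute access or item access."""

    def __getattr__(self, name):
        return Expr(lambda row: row[name], name)

    def __getitem__(self, name):
        return Expr(lambda row: row[name], name)


X = Columns()


class Table:
    """Toy stand-in for a DataFrame: a list of row dictionaries."""

    def __init__(self, rows):
        self.rows = [dict(row) for row in rows]

    def __rshift__(self, verb):
        # `table >> verb` hands the table to the verb and returns its result.
        return verb(self)


def sift(condition):
    """Keep the rows for which the deferred condition is true."""
    return lambda table: Table(r for r in table.rows if resolve(condition, r))


def mutate(**new_columns):
    """Add columns computed from deferred expressions."""
    def apply(table):
        out = []
        for row in table.rows:
            row = dict(row)
            for name, expr in new_columns.items():
                row[name] = resolve(expr, row)
            out.append(row)
        return Table(out)
    return apply


def select(*columns):
    """Keep only the referenced columns."""
    names = [c.name for c in columns]
    return lambda table: Table({n: r[n] for n in names} for r in table.rows)


def delay(func):
    """Let an ordinary function accept deferred arguments (hypothetical helper)."""
    def wrapper(*args):
        return Expr(lambda row: func(*(resolve(a, row) for a in args)))
    return wrapper


@delay
def price_band(price):
    return "high" if price >= 1000 else "low"


diamonds = Table([
    {"cut": "Ideal", "carat": 0.5, "price": 1000, "table width": 55},
    {"cut": "Fair", "carat": 1.0, "price": 900, "table width": 60},
    {"cut": "Ideal", "carat": 0.3, "price": 400, "table width": 57},
])

result = (
    diamonds
    >> sift(X.price > 500)
    >> mutate(per_carat=X.price / X.carat, band=price_band(X.price))
    >> select(X.cut, X["table width"], X.per_carat, X.band)
)
```

Nothing is computed when `X.price > 500` is written; the expression is only evaluated once a verb receives a table through `>>`. The sketch evaluates expressions one row at a time for simplicity, which is an assumption of this example rather than a statement about how Dplython works on pandas columns.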
At version 0.0.7, the project is explicitly marked as experimental and subject to change, so updates may introduce breaking changes and it should not be treated as production-ready.
As a niche wrapper around pandas, it has fewer users and less active development than mainstream libraries, which can mean slower bug fixes and limited documentation.
Relies on overloading `>>`, which is Python's built-in right-shift operator, as a pipe. This can confuse developers unfamiliar with dplyr and may cause integration issues with other Python libraries that overload the same operator.
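To make the operator concern concrete, the following minimal sketch shows how Python resolves `>>` when more than one type overloads it. The `PipeA` and `PipeB` classes are hypothetical stand-ins for two unrelated libraries and are not taken from Dplython.

```python
class PipeA:
    """Stand-in for one library that overloads `>>`."""

    def __rshift__(self, other):
        return "A handled it"


class PipeB:
    """Stand-in for a second, unrelated library that also overloads `>>`."""

    def __rshift__(self, other):
        return "B handled it"

    def __rrshift__(self, other):
        return "B handled it (reflected)"


# On integers, `>>` keeps its ordinary meaning: a bitwise right shift.
plain_shift = 8 >> 1

# The left operand's __rshift__ is tried first, so PipeA decides the result
# and PipeB's reflected method is never consulted.
left_wins = PipeA() >> PipeB()

# int does not know how to shift by a PipeB, so Python falls back to the
# right operand's __rrshift__.
reflected = 8 >> PipeB()
```

Because the left operand is consulted first, mixing objects from two libraries that both treat `>>` as a pipe can silently dispatch to the wrong one.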