A pandas-based Python library for calculating weighted statistics like means, medians, standard deviations, and distributions.
weightedcalcs is a Python library built on pandas that provides tools for calculating weighted statistical measures like means, medians, standard deviations, and distributions. It solves the problem of accurately analyzing datasets where observations have different weights, such as survey responses or census data.
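To make the idea of weighting concrete, here is a minimal plain-Python sketch of a weighted mean. It does not use weightedcalcs or pandas; the `weighted_mean` helper and the sample figures are hypothetical, chosen only to show how weights shift the result.

```python
def weighted_mean(values, weights):
    """Return sum(value * weight) / sum(weight)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical survey: three respondents, each standing in for a
# different number of people in the population.
incomes = [20_000, 50_000, 100_000]
weights = [5, 3, 2]

unweighted = sum(incomes) / len(incomes)    # treats every respondent equally
weighted = weighted_mean(incomes, weights)  # 45000.0
```

Because the low-income respondent carries the largest weight, the weighted mean falls below the unweighted one.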
It is aimed at data scientists, researchers, and analysts who work with weighted datasets in Python, particularly those who use pandas for data manipulation and statistical analysis.
Developers choose weightedcalcs for its seamless pandas integration, clean API, and built-in data integrity checks, making it a reliable and straightforward solution for weighted calculations compared to manual implementations or less integrated alternatives.
Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more.
Works directly with pandas DataFrames and DataFrameGroupBy objects, so it fits into existing data analysis workflows, as the ACS data example demonstrates.
Raises errors when data contains null values, preventing inaccurate weighted calculations and ensuring reliability, a feature explicitly highlighted in the README.
Uses a simple Calculator class with methods like mean() and distribution(), offering a straightforward interface for common weighted stats without complex setup.
Supports essential weighted statistics including means, medians, quantiles, standard deviations, and distributions, covering most needs for survey and census data analysis.
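As a rough illustration of what two of these statistics mean, the sketch below computes a weighted distribution and a weighted median in plain Python. It is not the library's implementation: the helper names and data are hypothetical, and the median uses one common convention (the smallest value whose cumulative weight reaches half the total), which may differ from the convention weightedcalcs follows.

```python
from collections import defaultdict


def weighted_distribution(categories, weights):
    """Return each category's share of the total weight."""
    totals = defaultdict(float)
    for category, weight in zip(categories, weights):
        totals[category] += weight
    grand_total = sum(totals.values())
    return {category: total / grand_total for category, total in totals.items()}


def weighted_median(values, weights):
    """Return the smallest value whose cumulative weight reaches half the total.

    Other definitions interpolate between neighbouring values instead.
    """
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2
    cumulative = 0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    raise ValueError("values must not be empty")


# Hypothetical responses: the last respondent represents five people.
education = ["hs", "college", "hs", "grad"]
incomes = [10, 20, 30, 40]
weights = [1, 1, 1, 5]

shares = weighted_distribution(education, weights)
median_income = weighted_median(incomes, weights)
```

With these weights, "grad" accounts for 5 of the 8 weight units, and the weighted median lands on the heavily weighted income of 40 instead of the middle of the four raw values.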
Only includes basic weighted calculations; lacks advanced functions like weighted variance or regression, which may require supplementing with other libraries as noted in the 'Other libraries' section.
Built entirely on pandas, inheriting its memory usage and performance limitations, making it less suitable for very large datasets or real-time processing compared to lightweight alternatives.
While it raises errors on null values, the README provides minimal guidance on issues like negative weights or non-numeric data, so users must handle that preprocessing themselves.
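One possible shape for that preprocessing is sketched below in plain Python. The `validate_rows` helper is hypothetical and not part of weightedcalcs, and treating negative weights as an error is an assumption; some analyses may prefer to drop or correct such rows instead.

```python
import math
from numbers import Real


def validate_rows(rows):
    """Check (value, weight) pairs before any weighted calculation.

    Raises ValueError on the first problem found; returns rows unchanged
    when every pair passes.
    """
    for index, (value, weight) in enumerate(rows):
        for name, item in (("value", value), ("weight", weight)):
            # bool counts as a Real in Python, so reject it explicitly.
            if isinstance(item, bool) or not isinstance(item, Real):
                raise ValueError(f"row {index}: {name} is not numeric: {item!r}")
            if math.isnan(item):
                raise ValueError(f"row {index}: {name} is missing (NaN)")
        # Assumption: negative weights are treated as invalid input.
        if weight < 0:
            raise ValueError(f"row {index}: negative weight {weight}")
    return rows
```

Running a check like this first means bad rows surface with a row number before they reach a weighted calculation.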