A Python library for evaluating binary classifiers in machine learning ensembles using Shapley value computation and approximation methods.
Shapley is a Python library for evaluating binary classifiers within machine learning ensembles by computing their Shapley values in weighted voting games. It quantifies each model's contribution to the ensemble's performance and assesses player pool heterogeneity through Shapley entropy. The library implements rigorous methods from economics and computer science to provide a robust framework for classifier evaluation.
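The idea of a Shapley value in a weighted voting game can be illustrated with a minimal sketch. It does not use the library's own API: the function name `exact_shapley`, the weights, and the quota are illustrative assumptions. Each player's value is taken as the fraction of orderings in which that player is pivotal, meaning the one whose weight first lifts the running total to the quota.

```python
from itertools import permutations
from math import factorial


def exact_shapley(weights, quota):
    """Shapley values of a weighted voting game by enumerating all orderings.

    Hypothetical helper for illustration only, not part of the library.
    """
    n = len(weights)
    pivots = [0] * n
    for order in permutations(range(n)):
        total = 0.0
        for player in order:
            total += weights[player]
            if total >= quota:
                # This player turns a losing coalition into a winning one.
                pivots[player] += 1
                break
    return [count / factorial(n) for count in pivots]


# Three classifiers with illustrative voting weights and a quota of 0.6.
values = exact_shapley([0.5, 0.3, 0.2], quota=0.6)
print(values)  # roughly [0.667, 0.167, 0.167]
```

The heaviest voter is pivotal in four of the six orderings, so it receives two thirds of the credit even though it holds only half of the total weight.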
Shapley is aimed at machine learning researchers and data scientists who work with ensemble models and need to analyze individual classifier contributions and ensemble diversity. It is particularly suited to those implementing or studying applications of cooperative game theory in ML.
Developers choose Shapley for its research-backed implementation of multiple Shapley value computation methods, including exact and approximation techniques, in a single, well-documented library. Its unique focus on weighted voting games and Shapley entropy for heterogeneity analysis sets it apart from general-purpose Shapley value tools.
The official implementation of "The Shapley Value of Classifiers in Ensemble Games" (CIKM 2021).
Shapley implements exact enumeration and several approximation techniques, such as Monte Carlo permutation sampling, so users can trade accuracy against efficiency depending on ensemble size.
The methods come from peer-reviewed work in economics and computer science, and the accompanying research paper provides validation for the approaches.
The project includes detailed tutorials, full test coverage, and illustrative toy examples, so users can quickly apply the library to their own problems.
It provides Shapley entropy to measure the diversity of the classifier pool, a feature not commonly found in other tools and useful for in-depth ensemble analysis.
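Two of the ideas above, permutation sampling and Shapley entropy, can be sketched in plain Python. This is not the library's API: `sampled_shapley` and `shapley_entropy` are hypothetical names, and the entropy is assumed here to be the Shannon entropy (natural log) of the normalized Shapley values, which may differ in detail from the paper's definition.

```python
import math
import random


def sampled_shapley(weights, quota, n_samples=10_000, seed=0):
    """Estimate Shapley values of a weighted voting game from random orderings.

    Hypothetical helper for illustration only, not part of the library.
    """
    rng = random.Random(seed)
    players = list(range(len(weights)))
    pivots = [0] * len(weights)
    for _ in range(n_samples):
        rng.shuffle(players)
        total = 0.0
        for player in players:
            total += weights[player]
            if total >= quota:
                # Count how often each player is the pivotal voter.
                pivots[player] += 1
                break
    return [count / n_samples for count in pivots]


def shapley_entropy(values):
    """Shannon entropy of normalized Shapley values (assumed definition)."""
    total = sum(values)
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log(p) for p in probs)


estimates = sampled_shapley([0.5, 0.3, 0.2], quota=0.6)
print(estimates)                      # close to the exact values 2/3, 1/6, 1/6
print(shapley_entropy(estimates))     # lower when one classifier dominates
print(shapley_entropy([0.25] * 4))    # log(4), the maximum for four players
```

More samples tighten the estimate at the cost of run time, which is the accuracy and efficiency trade-off mentioned above. Under the assumed definition, a higher entropy indicates that credit is spread more evenly across the classifier pool.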
The library only supports binary classification tasks, so it is unsuitable for multi-class or regression models, which limits its use in broader machine learning workflows.
Methods such as exact enumeration can be slow for large ensembles, approximations require careful tuning, and the README mentions no built-in support for parallel processing.
Using the library requires an understanding of cooperative game theory concepts such as Shapley values and weighted voting games, which poses a learning curve for practitioners without that background.
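To make the scaling concern about exact enumeration concrete, the short sketch below counts how many orderings and coalitions an exhaustive method would have to visit. Which of the two the library's exact solver actually enumerates is not stated here, so both are shown. The helper name `enumeration_cost` is an assumption for illustration.

```python
from math import factorial


def enumeration_cost(n_classifiers):
    """Return (number of orderings, number of coalitions) for n classifiers."""
    return factorial(n_classifiers), 2 ** n_classifiers


for n in (5, 10, 15, 20):
    orderings, coalitions = enumeration_cost(n)
    print(f"{n:>2} classifiers: {orderings:,} orderings, {coalitions:,} coalitions")
```

Even at 10 classifiers there are 3,628,800 orderings, which is why sampling-based approximations become attractive as ensembles grow.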