A Python library implementing fairness-aware machine learning algorithms for measuring and mitigating discrimination in predictive models.
Themis ML helps ensure that predictive models treat social groups fairly with respect to consequential outcomes such as loan approvals or hiring decisions. It provides tools for discrimination discovery and for mitigation at the preprocessing, model estimation, and postprocessing stages.
It is aimed at data scientists, machine learning engineers, and researchers working on ethical AI who need to audit and improve the fairness of their predictive models, particularly in domains with socioeconomic or legal implications.
Developers choose Themis ML because it offers a scikit-learn-compatible toolkit focused specifically on fairness, with algorithms for measuring discrimination and several mitigation techniques, all built on the familiar Python data science stack.
Integrates with existing scikit-learn workflows because it is built on top of scikit-learn, so it can be adopted in standard ML pipelines with little friction.
Implements techniques across preprocessing (e.g., relabelling), model estimation (e.g., additive counterfactually fair estimator), and postprocessing (e.g., reject option classification) for comprehensive bias mitigation.
Includes utility functions to load commonly used fairness datasets like German Credit and Census Income, facilitating quick experimentation and benchmarking.
Provides references to academic papers and comprehensive documentation on Read the Docs, ensuring users can understand the underlying fairness concepts.
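To make the postprocessing idea concrete, here is a minimal pure-Python sketch of reject option classification as it is usually described in the fairness literature: predictions that fall close to the decision boundary are reassigned in favor of the disadvantaged group. This is not the Themis ML API. The function name `reject_option_classify`, the `theta` parameter, and the convention that `s == 1` marks the disadvantaged group are assumptions made for illustration.

```python
def reject_option_classify(probabilities, s, theta=0.6):
    """Relabel low-confidence predictions to favor the disadvantaged group.

    probabilities: predicted P(y = 1) for each sample
    s: protected attribute per sample (assumed: 1 = disadvantaged, 0 = advantaged)
    theta: confidence threshold in (0.5, 1]; predictions whose confidence
           max(p, 1 - p) is below theta fall in the "critical region"
    """
    if not 0.5 < theta <= 1.0:
        raise ValueError("theta must be in the interval (0.5, 1]")
    if len(probabilities) != len(s):
        raise ValueError("probabilities and s must have the same length")

    labels = []
    for p, group in zip(probabilities, s):
        confidence = max(p, 1.0 - p)
        if confidence < theta:
            # Critical region: favorable label for the disadvantaged group,
            # unfavorable label for the advantaged group.
            labels.append(1 if group == 1 else 0)
        else:
            # Confident prediction: apply the usual 0.5 decision threshold.
            labels.append(1 if p >= 0.5 else 0)
    return labels


probs = [0.9, 0.55, 0.45, 0.2, 0.58, 0.42]
groups = [0, 0, 1, 1, 1, 0]
print(reject_option_classify(probs, groups, theta=0.6))
```

A larger `theta` widens the critical region, so more predictions are reassigned; a value just above 0.5 leaves almost all predictions unchanged.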
The README's checklist shows that several planned features, such as reweighting, sampling, and the prejudice remover regularized estimator, are not yet implemented, which limits functionality.
Of the seven listed datasets, only German Credit and Census Income are currently supported, which restricts broader benchmarking and experimentation.
Relies on users understanding complex fairness definitions (e.g., statistical parity), which may require additional expertise in ethical AI beyond standard ML knowledge.
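As an illustration of one such definition, statistical parity compares the rate of favorable outcomes between groups. The sketch below computes that gap in plain Python. It does not use the Themis ML API, and the function names and the encoding (`y == 1` is the favorable outcome, `s == 1` is the disadvantaged group) are assumptions made for this example.

```python
def positive_rate(outcomes):
    """Fraction of favorable (1) outcomes in a list of 0/1 labels."""
    return sum(outcomes) / len(outcomes)


def statistical_parity_difference(y, s):
    """Return P(y = 1 | s = 0) - P(y = 1 | s = 1).

    y: binary outcomes (assumed: 1 = favorable)
    s: binary protected attribute (assumed: 1 = disadvantaged group)
    A value of 0 means both groups receive the favorable outcome at the
    same rate; a positive value means the advantaged group is favored.
    """
    if len(y) != len(s):
        raise ValueError("y and s must have the same length")
    advantaged = [yi for yi, si in zip(y, s) if si == 0]
    disadvantaged = [yi for yi, si in zip(y, s) if si == 1]
    if not advantaged or not disadvantaged:
        raise ValueError("both groups must contain at least one sample")
    return positive_rate(advantaged) - positive_rate(disadvantaged)


# Toy data: 3 of 4 advantaged applicants approved, 1 of 4 disadvantaged.
y = [1, 1, 1, 0, 1, 0, 0, 0]
s = [0, 0, 0, 0, 1, 1, 1, 1]
print(statistical_parity_difference(y, s))
```

In this toy data the approval rates are 0.75 and 0.25, so the gap is 0.5. Other fairness definitions condition on additional information, such as the true label, and can disagree with statistical parity on the same predictions.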