A method for explaining complex models by selecting the most informative feature subsets through mutual information maximization.
L2X is a model interpretation method that selects the most informative feature subsets to explain predictions from complex machine learning models. It formulates feature selection as an optimization problem that maximizes mutual information between the selected features and the model's outputs. Explanations are instance-specific, and because the method is applied post hoc, the original model and its predictive accuracy are left unchanged.
Machine learning researchers and practitioners working on explainable AI who need to interpret black-box model predictions, particularly those using deep learning models where traditional interpretation methods are insufficient.
L2X offers an information-theoretic foundation for model interpretation that provides mathematically grounded feature importance scores. Unlike gradient-based methods, it selects discrete feature subsets and works with any black-box model without requiring access to gradients or internal parameters.
L2X (Learning to Explain) is a model interpretation framework that identifies the most informative features for a model's predictions through an information-theoretic approach. It provides instance-wise feature selection to explain black-box models while maintaining predictive performance.
L2X approaches model interpretation as an information-theoretic optimization problem, selecting minimal feature subsets that preserve maximal information about model predictions.
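For readers who want the optimization problem written out, one way to state it is given below. The notation is assumed here for illustration and does not appear in this listing: X is the input, Y is the output of the model being explained, S is the selected subset of k features, X_S is the input restricted to those features, and E is the explainer that maps each input to a distribution over subsets.

```latex
\max_{\mathcal{E}} \; I(X_S; Y)
\quad \text{subject to} \quad S \sim \mathcal{E}(X), \quad |S| = k,
\qquad \text{where} \qquad
I(X_S; Y) = \mathbb{E}\!\left[ \log \frac{P(Y \mid X_S)}{P(Y)} \right].
```

Because S is drawn per input, the selected subset can differ from one instance to the next, which is what makes the explanations instance-wise. The subset size k is a hyperparameter chosen by the user.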
Uses mutual information optimization to select feature subsets, providing a mathematically grounded foundation for explanations, as detailed in the ICML 2018 paper.
Delivers explanations tailored to individual predictions, enhancing interpretability for complex models by focusing on per-instance feature relevance.
Works with various black-box models without requiring access to internal parameters, making it versatile for different machine learning applications.
Selects minimal feature subsets that retain maximal information about predictions. Because the explanation is computed separately from the model, the model's own performance is not affected.
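The instance-wise subset selection described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's code: it assumes an explainer has already produced one score (logit) per feature for a single instance, and it uses a Gumbel-softmax style relaxation (k relaxed one-hot draws combined by an elementwise maximum) as a differentiable stand-in for picking k features during training, with a hard top-k mask at explanation time. The function names `sample_relaxed_subset` and `top_k_mask`, the temperature `tau`, and the example logits are all hypothetical.

```python
import numpy as np


def sample_relaxed_subset(logits, k, tau=0.5, rng=None):
    """Return a soft mask over features approximating a random k-subset.

    Draws k Gumbel-softmax (relaxed one-hot) vectors from the same logits
    and combines them with an elementwise maximum.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = logits.shape[-1]
    # Gumbel(0, 1) noise, one independent draw per (sample, feature) pair.
    u = rng.uniform(low=1e-10, high=1.0, size=(k, d))
    gumbel = -np.log(-np.log(u))
    # Temperature-scaled softmax over features, computed stably per row.
    scores = (logits[None, :] + gumbel) / tau
    scores -= scores.max(axis=1, keepdims=True)
    soft = np.exp(scores)
    soft /= soft.sum(axis=1, keepdims=True)
    # Elementwise max across the k relaxed one-hot vectors.
    return soft.max(axis=0)


def top_k_mask(logits, k):
    """Hard 0/1 mask keeping the k highest-scoring features."""
    mask = np.zeros_like(logits, dtype=float)
    mask[np.argsort(logits)[-k:]] = 1.0
    return mask


# Hypothetical per-feature scores for one instance with five features.
example_logits = np.array([0.1, 2.0, -1.0, 3.0, 0.5])
relaxed = sample_relaxed_subset(example_logits, k=2, rng=np.random.default_rng(0))
hard = top_k_mask(example_logits, k=2)
```

In a full pipeline the soft mask would multiply the input before it is passed to a second network trained to predict the black-box model's output, so that the per-feature scores can be learned by gradient descent. Only the black-box model's outputs are needed for that, not its gradients or parameters.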
Requires TensorFlow 1.2.1 and Keras 2.0, which are legacy versions that may conflict with modern libraries and lack support for recent deep learning advancements.
The README provides minimal setup instructions and lacks detailed tutorials or API references, making it challenging for newcomers to implement beyond replication.
The information-theoretic optimization requires training a separate explainer, which can be resource-intensive for large datasets or complex models.
As a research project from 2018, it lacks regular updates, bug fixes, and community-driven extensions, reducing long-term viability for evolving needs.