An automated feature generation framework for tabular data that discovers expert-level features to boost machine learning model performance.
OpenFE is an automated feature generation framework for tabular data that systematically creates new candidate features to improve machine learning model performance. It supports various tasks like classification and regression, and is designed to be efficient and easy to use, often outperforming human experts in feature engineering.
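To make "systematically creates new candidate features" concrete, here is a minimal standard-library sketch of operator-based candidate generation. It illustrates the general idea only and is not OpenFE's implementation: the toy table, the four operators, and the helper names (`apply_operator`, `generate_candidates`) are all assumptions made for this example.

```python
from itertools import combinations
import math

# Hypothetical toy table: column name -> numeric values (None marks a missing value).
table = {
    "price": [10.0, 20.0, None, 40.0],
    "quantity": [2.0, 4.0, 5.0, 0.0],
}


def safe_div(a, b):
    """Divide a by b, returning NaN instead of raising on division by zero."""
    return a / b if b != 0 else math.nan


# A few illustrative binary operators; this is not OpenFE's actual operator set.
OPERATORS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": safe_div,
}


def apply_operator(op, left, right):
    """Apply op element-wise, keeping None wherever either input is missing."""
    return [
        None if a is None or b is None else op(a, b)
        for a, b in zip(left, right)
    ]


def generate_candidates(columns):
    """Build one candidate feature per (operator, unordered column pair)."""
    candidates = {}
    for left, right in combinations(columns, 2):
        for symbol, op in OPERATORS.items():
            name = f"({left}{symbol}{right})"
            candidates[name] = apply_operator(op, columns[left], columns[right])
    return candidates


candidates = generate_candidates(table)
```

Even in this toy version the candidate pool grows with the number of operators times the number of column pairs, which is why a real framework needs a selection step to keep only the useful features.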
Data scientists, machine learning engineers, and researchers working with tabular data who need to enhance model performance through automated feature engineering, particularly in competitive settings like Kaggle.
Developers choose OpenFE for its ability to generate expert-level features automatically, its efficiency with parallel computing, and its proven track record of outperforming existing methods and human experts in real-world competitions.
OpenFE: automated feature generation with expert-level performance
Provides 23 operators, listed in the README, for generating diverse candidate features, enabling comprehensive feature exploration.
Automatically processes missing values and categorical features during feature generation, reducing manual preprocessing effort for tabular data.
Validated in Kaggle competitions: for example, it beat 99.3% of teams in IEEE-CIS Fraud Detection, demonstrating expert-level feature engineering capabilities.
Supports parallel processing through the n_jobs parameter, which speeds up execution on large datasets and improves scalability.
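The workflow these points describe can be sketched as below. This assumes the `OpenFE().fit(...)` and `transform(...)` calls shown in the project README; the wrapper function `add_openfe_features` and its default `n_jobs=4` are hypothetical, and the exact signatures should be checked against the installed version.

```python
def add_openfe_features(train_x, train_y, test_x, n_jobs=4):
    """Generate features on the training data and apply them to both splits.

    train_x, test_x: pandas DataFrames of raw features (assumed).
    train_y: labels aligned with train_x (assumed).
    n_jobs: number of parallel workers passed through to OpenFE.
    """
    # Imported inside the function so this sketch loads without openfe installed.
    from openfe import OpenFE, transform

    ofe = OpenFE()
    # fit() searches the candidate features and returns the ones it selects.
    features = ofe.fit(data=train_x, label=train_y, n_jobs=n_jobs)
    # transform() appends the selected features to the train and test frames.
    return transform(train_x, test_x, features, n_jobs=n_jobs)
```

A call would look like `train_x, test_x = add_openfe_features(train_x, train_y, test_x, n_jobs=8)`, with the enriched frames then passed to any downstream model.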
The README warns that installing through conda may pull in a different, unrelated package, so OpenFE has to be installed with pip instead, which can confuse users who rely on conda.
Designed specifically for tabular data, limiting applicability to other data types like images or text without significant preprocessing.
Despite parallel support, feature generation can be resource-heavy, demanding substantial CPU and memory for large datasets, which may not suit all environments.
Automatic extraction of relevant features from time series.
An open-source Python library for automated feature engineering.
Open-source Python library for feature engineering and selection, compatible with scikit-learn.