A unified framework for implementing and training deep learning models on tabular data using PyTorch and PyTorch Lightning.
PyTorch Tabular is a high-level deep learning framework designed specifically for tabular data. It provides a unified API to build, train, and deploy state-of-the-art neural network models for classification and regression tasks, and it makes advanced techniques such as probabilistic regression and semi-supervised learning easier to apply. Under the hood it leverages PyTorch Lightning for scalable training and automatic logging.
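The unified, config-driven API looks roughly like the following sketch, based on the library's documented quickstart. Column names and the `train_df`/`test_df` DataFrames are illustrative placeholders, not part of the library:

```python
from pytorch_tabular import TabularModel
from pytorch_tabular.config import DataConfig, OptimizerConfig, TrainerConfig
from pytorch_tabular.models import CategoryEmbeddingModelConfig

# Declare which columns are targets, continuous, and categorical
data_config = DataConfig(
    target=["target"],                 # placeholder column name
    continuous_cols=["age", "income"], # placeholder column names
    categorical_cols=["city"],         # placeholder column name
)

# Training loop settings are delegated to PyTorch Lightning
trainer_config = TrainerConfig(max_epochs=10, batch_size=256)

# Pick any available model via its config class
model_config = CategoryEmbeddingModelConfig(task="classification")

tabular_model = TabularModel(
    data_config=data_config,
    model_config=model_config,
    optimizer_config=OptimizerConfig(),
    trainer_config=trainer_config,
)

# train_df / test_df are assumed pandas DataFrames
tabular_model.fit(train=train_df)
predictions = tabular_model.predict(test_df)
```

Swapping architectures (say, to TabNet or FT Transformer) only requires replacing the model config class; the data and trainer configuration stay the same.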
Data scientists and machine learning engineers working with structured, tabular datasets who want to experiment with or productionize deep learning models beyond traditional gradient boosting. It also targets researchers needing a flexible, customizable codebase for developing new tabular architectures.
Developers choose PyTorch Tabular for its comprehensive collection of modern deep learning models (like TabNet, NODE, and FT Transformer) accessible through a consistent, high-level interface. Its integration with PyTorch Lightning offers out-of-the-box scalability across hardware and training utilities, reducing boilerplate while maintaining customization flexibility for novel architectures.
Provides a consistent interface for data, model, and training configuration using classes like DataConfig and TrainerConfig, simplifying the setup process as shown in the usage example.
Includes implementations of advanced models like TabNet, NODE, and FT Transformer, listed in the Available Models section, enabling easy experimentation with cutting-edge architectures.
Leverages PyTorch Lightning for distributed training, automatic logging, and efficient GPU/CPU utilization, making it straightforward to scale experiments across hardware.
Designed to make implementing new architectures low-friction, supported by a tutorial on implementing new models, so custom designs integrate cleanly with the framework.
Supports Mixture Density Networks for uncertainty-aware regression and Denoising AutoEncoders for semi-supervised learning, addressing advanced use cases directly in the framework.
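The Mixture Density Network support mentioned above rests on a standard idea: instead of predicting a single value, the network outputs mixture weights, means, and scales, and training minimizes the negative log-likelihood of a Gaussian mixture. A minimal NumPy sketch of that loss (this is the general technique, not PyTorch Tabular's internal implementation; all names are illustrative):

```python
import numpy as np

def mdn_nll(pi, mu, sigma, y):
    """Mean negative log-likelihood of y under a 1-D Gaussian mixture.

    pi:    (n, k) mixture weights per sample (rows sum to 1)
    mu:    (n, k) component means
    sigma: (n, k) component standard deviations (positive)
    y:     (n,)   regression targets
    """
    y = y[:, None]
    # Per-component Gaussian density
    comp = np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    # Mixture density per sample, then mean negative log-likelihood
    return -np.mean(np.log(np.sum(pi * comp, axis=1) + 1e-12))

# Two-component mixture for three samples
pi = np.full((3, 2), 0.5)
mu = np.tile([0.0, 2.0], (3, 1))
sigma = np.ones((3, 2))
loss = mdn_nll(pi, mu, sigma, np.array([0.0, 1.0, 2.0]))
```

Because the model predicts a full distribution rather than a point estimate, the spread of the mixture gives a direct measure of predictive uncertainty.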
The project roadmap notes that a scikit-learn-compatible API is not yet implemented, limiting integration with traditional ML workflows and tools like GridSearchCV.
Requires familiarity with PyTorch and PyTorch Lightning, which can be a barrier for teams accustomed to simpler libraries like scikit-learn or XGBoost.
The data module may not efficiently handle datasets larger than RAM, as indicated by the planned migration to Polars or NVTabular in the roadmap, potentially slowing down preprocessing.
Compared to established frameworks like TensorFlow or scikit-learn, the community and third-party integrations are less mature, which might affect support, tutorials, and bug fixes.