A PyTorch implementation of TResNet, a high-performance convolutional neural network architecture optimized for GPU training and inference.
TResNet is a family of convolutional neural network architectures designed to deliver superior accuracy while maintaining high GPU training and inference throughput. It addresses the limitations of FLOPs-optimized models by introducing modifications that better utilize GPU structure, offering a better speed-accuracy trade-off than many contemporary networks.
TResNet is aimed at machine learning researchers and engineers who need efficient, high-accuracy convolutional neural networks for computer vision tasks, particularly those working with GPU-accelerated training and inference pipelines.
Developers choose TResNet because it prioritizes actual GPU efficiency over theoretical FLOPs count, achieving state-of-the-art accuracy on ImageNet and transfer learning datasets while offering higher images-per-second throughput than ResNet50 and other networks with similar accuracy.
Official PyTorch Implementation of "TResNet: High-Performance GPU-Dedicated Architecture" (WACV 2021)
Architectural modifications maximize GPU utilization: according to the repository's benchmark tables, TResNet-M reaches an inference speed of 2930 images/sec on a V100, about 3.5% faster than ResNet50's 2830 images/sec.
Achieves state-of-the-art results, such as 80.7% top-1 accuracy on ImageNet with TResNet-M, and sets SOTA on transfer learning datasets like Stanford Cars (96.0%) and Oxford-Flowers (99.1%).
Excels beyond single-label classification: in multi-label classification, TResNet-L reached 86.4 mAP on MS-COCO, surpassing the previous SOTA by over 2.5%, and the architecture also performs well on object detection tasks.
Transfers well from ImageNet pretraining, with ImageNet21K pretrained weights boosting TResNet-M from 80.7% to 83.1% accuracy on ImageNet, and achieving top scores on competitive datasets.
The repository does not provide the exact training code used for the paper's results and instead points to the external implementation in rwightman/pytorch-image-models, which can hinder reproducibility and customization.
TResNet-M requires 5.5 GFLOPs compared to EfficientNet-B1's 0.6 GFLOPs, roughly nine times as many, making it less suitable when FLOPs are the primary metric, even though the README reports better actual GPU throughput.
Requires the Inplace-ABN layer, which adds setup complexity and may not be as widely supported as standard batch normalization, with the README including separate tips for working with it.
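Because the comparisons above are based on measured images/sec rather than FLOPs, it can help to see how such a number is obtained. The following is a minimal sketch, not the benchmark code used by the authors: `measure_throughput`, `relative_speedup`, and all parameter names are hypothetical helpers introduced here for illustration. It times any callable that processes one batch, and the last lines apply the percentage calculation to the two V100 figures quoted above.

```python
import time
from typing import Callable, Optional


def measure_throughput(
    run_batch: Callable[[], None],
    batch_size: int,
    n_iters: int = 50,
    n_warmup: int = 5,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return images/sec for `run_batch`, which processes one batch per call.

    `sync` is an optional hook that blocks until queued work has finished.
    GPU kernels launch asynchronously, so without it the timer may stop
    before the device has actually completed the batches.
    """
    # Warm-up iterations are excluded from timing (caches, lazy init, etc.).
    for _ in range(n_warmup):
        run_batch()
    if sync is not None:
        sync()

    start = time.perf_counter()
    for _ in range(n_iters):
        run_batch()
    if sync is not None:
        sync()
    elapsed = time.perf_counter() - start

    return batch_size * n_iters / elapsed


def relative_speedup(fast: float, slow: float) -> float:
    """Percentage by which throughput `fast` exceeds throughput `slow`."""
    return (fast / slow - 1.0) * 100.0


# Figures quoted in the text: TResNet-M vs ResNet50 inference on a V100.
speedup_pct = relative_speedup(2930, 2830)
print(f"TResNet-M is {speedup_pct:.1f}% faster than ResNet50")
```

With PyTorch, one plausible way to use this would be to pass a callable that runs `model(x)` under `torch.no_grad()` as `run_batch` and `torch.cuda.synchronize` as `sync`. Measured values depend on batch size, precision, and hardware, so they will not necessarily match the published tables.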