A TensorFlow 2 library providing simple, composable abstractions for machine learning research via the snt.Module concept.
Sonnet is a TensorFlow 2 neural network library developed by DeepMind, designed to provide simple, composable abstractions for machine learning research. It introduces `snt.Module` as the core building block for constructing self-contained neural network components, enabling flexible, decoupled designs for ML tasks such as supervised learning and reinforcement learning.
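The module idea can be illustrated without TensorFlow itself. Below is a minimal framework-free sketch (plain Python, not Sonnet's actual API) of the pattern `snt.Module` encourages: a self-contained layer that builds its own parameters lazily from the input, and a network that is simply a component composing other components.

```python
import random

class MyLinear:
    """Framework-free sketch of a self-contained layer in the
    snt.Module spirit: parameters are created lazily, on first
    call, from the shape of the input."""
    def __init__(self, output_size):
        self.output_size = output_size
        self.w = None  # built on first call
        self.b = None

    def __call__(self, x):
        if self.w is None:  # lazy parameter creation
            self.w = [[random.gauss(0.0, 0.1) for _ in range(self.output_size)]
                      for _ in x]
            self.b = [0.0] * self.output_size
        return [sum(xi * row[j] for xi, row in zip(x, self.w)) + self.b[j]
                for j in range(self.output_size)]

class MLP:
    """Composition: a network is just a component that owns
    other components and wires them together."""
    def __init__(self, output_sizes):
        self.layers = [MyLinear(n) for n in output_sizes]

    def __call__(self, x):
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i < len(self.layers) - 1:
                x = [max(0.0, v) for v in x]  # ReLU between layers
        return x

mlp = MLP([16, 4])
out = mlp([0.5, -0.2, 1.0])  # out has 4 entries
```

In Sonnet itself the same shape of code appears with `snt.Module` subclasses and `tf.Variable` parameters; the point of the sketch is only the design: each component owns its state and composes freely.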
Machine learning researchers and developers who need a flexible, unopinionated library for building and experimenting with neural networks in TensorFlow, particularly those who want fine-grained control over their model architecture and training process.
Developers choose Sonnet for its simplicity, clarity, and unopinionated programming model centered on modules: it gives full control over model design and training without tying users to a prescribed framework, making it well suited to research and custom implementations.
TensorFlow-based neural network library
The `snt.Module` abstraction enables composable, self-contained neural network components, from built-in networks such as `snt.nets.MLP` to hand-written custom layers like the MyLinear example.
Sonnet deliberately imposes no training framework, giving users full autonomy over their workflows; the project's philosophy emphasizes this as ideal for research and custom implementations.
Compatible with TensorFlow checkpointing and SavedModel for saving and restoring models, including in distributed setups, though exporting a SavedModel requires extra wrapping steps.
Includes modules such as snt.distribute.CrossReplicaBatchNorm for use with custom distribution strategies, enabling multi-GPU training without hidden assumptions, as demonstrated in the CIFAR-10 example.
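Because Sonnet leaves the training loop to the user, the loop itself is ordinary code. The following framework-free sketch shows the shape of such a hand-rolled loop on a one-parameter toy model; in real Sonnet/TensorFlow code the gradient would come from `tf.GradientTape` rather than a hand-derived formula.

```python
# Toy dataset: y = 2x, and a single-parameter linear model y_hat = w * x.
data = [(i / 10.0, 2.0 * i / 10.0) for i in range(1, 11)]
w = 0.0    # the model's only parameter
lr = 0.5   # learning rate, chosen by hand

for step in range(200):
    # Gradient of the mean squared error with respect to w:
    # d/dw mean((w*x - y)^2) = mean(2 * (w*x - y) * x)
    grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # plain gradient-descent update

# w converges to ~2.0
```

Writing the loop yourself is exactly the trade-off Sonnet makes: more initial code, but every step (batching, logging, gradient clipping, schedules) is yours to change.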
Users must build or adopt their own training loops, optimizers, and metrics, increasing initial development time compared to libraries with baked-in training.
Saving models for production requires wrapping them in a separate module whose methods are decorated with `tf.function` and given explicit input signatures, adding boilerplate and complexity, as detailed in the serialization section.
As a library maintained primarily by DeepMind, it has fewer third-party extensions, pre-trained models, and community resources than mainstream frameworks like Keras or PyTorch.
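The export boilerplate mentioned above follows a wrapper pattern that can be sketched without TensorFlow. In this framework-free analogy (the `ServingWrapper` class is illustrative, not part of any library), the trained model is not exported directly; instead it is wrapped in an object that pins down a single entry point with a fixed input signature, which is the role `tf.function` with an explicit input signature plays in a real export.

```python
class ServingWrapper:
    """Framework-free analogy of the export boilerplate: serving
    systems want a single entry point with a fixed input signature,
    so the trained model is wrapped rather than exported directly.
    (In TensorFlow this role is played by a module holding
    tf.function-decorated methods with explicit input signatures.)"""
    def __init__(self, model, input_size):
        self.model = model
        self.input_size = input_size

    def __call__(self, x):
        # Enforce the declared signature before delegating.
        if len(x) != self.input_size:
            raise ValueError(
                f"expected {self.input_size} inputs, got {len(x)}")
        return self.model(x)

# A trivial "trained model" standing in for a real network.
wrapped = ServingWrapper(lambda x: [sum(x)], input_size=3)
result = wrapped([1.0, 2.0, 3.0])  # [6.0]
```

The extra class is the boilerplate cost the cons list refers to; the benefit is that the exported artifact has a stable, explicit interface independent of how the model was built.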