A reliable PyTorch implementation of reinforcement learning algorithms for research and industry.
Stable Baselines3 is a PyTorch-based library providing reliable implementations of reinforcement learning algorithms. It solves the problem of inconsistent or buggy RL codebases by offering well-tested, performant versions of algorithms like PPO, SAC, and DQN, enabling researchers and developers to build upon a solid foundation.
Reinforcement learning researchers, AI practitioners, and industry developers who need robust, production-ready RL implementations for experimentation, benchmarking, and application development.
Developers choose Stable Baselines3 for its emphasis on reliability, comprehensive testing, and clean API, which reduces implementation errors and accelerates RL project development compared to building algorithms from scratch or using less stable alternatives.
The PyTorch version of Stable Baselines, offering reliable implementations of reinforcement learning algorithms.
Each algorithm is performance-tested and verified, providing trustworthy baselines for research and industry, as highlighted in the README's results section and OpenRL Benchmark reports.
Offers a consistent, scikit-learn-like interface across all algorithms, simplifying usage and reducing boilerplate code, as demonstrated in the README's example of training PPO in just a few lines.
Supports custom Gymnasium environments and user-defined neural network policies, enabling flexibility for diverse RL applications without modifying core library code.
Includes built-in Tensorboard logging, high code coverage with type hints, and integrations with services like Weights & Biases and Hugging Face for enhanced experimentation and model sharing.
Tied exclusively to PyTorch, which may not suit teams standardized on other frameworks such as TensorFlow or JAX, and lacks the speed optimizations of the JAX-based port SBX mentioned in the README.
Newer or experimental algorithms are maintained in the SB3-Contrib repository, adding complexity for users who need to manage multiple packages and potentially deal with less stable code.
Despite a simple API, the library explicitly notes it expects users to understand reinforcement learning fundamentals, making it less accessible for complete beginners without additional learning resources.