High-performance, end-to-end reinforcement learning implementations fully written in JAX for massive parallelization on GPUs.
PureJaxRL is a high-performance reinforcement learning library that implements end-to-end training pipelines entirely in JAX. It addresses the slow pace of RL experimentation by massively parallelizing agents on GPUs, achieving speedups of over 1000x compared to traditional PyTorch implementations. Because the environments themselves are written in JAX, researchers can JIT compile the complete training loop, environment included, for optimal performance.
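To make end-to-end compilation concrete, here is a minimal sketch of the idea with a toy environment and a one-parameter policy standing in for the real thing (illustrative only, not PureJaxRL's actual code): because every step is a pure JAX function, the whole rollout-and-update loop fits inside a single `jax.jit`.

```python
import jax
import jax.numpy as jnp

def env_step(state, action):
    # Toy 1-D environment: the state drifts toward the action.
    new_state = state + 0.1 * action
    reward = -jnp.abs(new_state)  # best reward when the state sits at 0
    return new_state, reward

def episode_return(params, env_state, length=100):
    # Roll out one episode with lax.scan; everything stays on-device.
    def step(env_state, _):
        action = jnp.tanh(params * env_state)  # trivial one-parameter policy
        env_state, reward = env_step(env_state, action)
        return env_state, reward
    _, rewards = jax.lax.scan(step, env_state, None, length=length)
    return rewards.sum()

@jax.jit
def train_step(params, env_state):
    # Environment, policy, and update all compile into one GPU program.
    ret, grad = jax.value_and_grad(episode_return)(params, env_state)
    return params + 1e-2 * grad, ret  # gradient ascent on episode return

params, env_state = jnp.array(0.5), jnp.array(1.0)
for _ in range(10):
    params, ret = train_step(params, env_state)
```

Since nothing in the loop ever leaves the accelerator or touches Python control flow, XLA can fuse the environment, policy, and update into one kernel schedule, which is where the large speedups come from.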
Reinforcement learning researchers and practitioners who need to run large-scale experiments, hyperparameter tuning, or meta-RL algorithms efficiently on GPU hardware.
Developers choose PureJaxRL for the unmatched performance gains of its full JAX implementation, which enables thousands of agents to be trained in parallel. Its single-file, research-friendly design makes it ideal for algorithm discovery and rapid prototyping compared to modular but slower alternatives.
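The README demonstrates this parallelism by vmapping an entire training function over a batch of random seeds. The sketch below mirrors that pattern with a deliberately toy `train` body; in the real library, the body would be a full rollout-and-update loop:

```python
import jax

def train(rng):
    # Stand-in body: in PureJaxRL this would be the complete training loop.
    rewards = jax.random.normal(rng, (1000,)) + 1.0
    return rewards.mean()

# vmap turns one run into many; jit fuses them into a single GPU program.
seeds = jax.random.split(jax.random.PRNGKey(42), 1024)
returns = jax.jit(jax.vmap(train))(seeds)  # 1024 independent "agents"
print(returns.shape)  # (1024,)
```

Because jit sees the vmapped function as one program, the 1024 runs execute as a single fused GPU computation rather than a Python loop over seeds.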
Really Fast End-to-End Jax RL Implementations
Achieves over 1000x speedup compared to PyTorch implementations by JIT compiling entire training loops, as demonstrated in performance plots for CartPole and MinAtar Breakout.
Leverages JAX's vmap to run thousands of agents simultaneously on a single GPU (and pmap to scale across devices), enabling rapid hyperparameter tuning and meta-RL research, as highlighted in the README; see the sketch after this list.
Provides clean, single-file implementations inspired by CleanRL, making it easy to modify and understand for algorithm development and experimentation.
A fully synchronous design simplifies debugging compared to asynchronous frameworks, making training loops easier to trace, as noted in the README.
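As a sketch of the hyperparameter-sweep use case mentioned above (a toy quadratic objective with illustrative names, not PureJaxRL code), vmapping a training function over a vector of learning rates evaluates every setting in one compiled call:

```python
import jax
import jax.numpy as jnp

def train(lr, rng):
    # Toy "training": minimise x^2 from a random init with plain SGD.
    # In PureJaxRL the body would be a full RL training loop.
    x = jax.random.normal(rng)
    def sgd_step(x, _):
        x = x - lr * 2.0 * x  # d(x^2)/dx = 2x
        return x, x ** 2
    _, losses = jax.lax.scan(sgd_step, x, None, length=100)
    return losses[-1]

lrs = jnp.array([1e-3, 1e-2, 1e-1])
rngs = jax.random.split(jax.random.PRNGKey(0), 3)
# One vmapped, jitted call trains under every learning rate at once;
# jax.pmap would shard the same batch across multiple devices.
final_losses = jax.jit(jax.vmap(train))(lrs, rngs)
```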
Implemented as single-file scripts not meant for import, which makes them difficult to integrate into larger projects or reuse as components, as acknowledged in the code philosophy section.
Relies entirely on JAX, which has a steeper learning curve and smaller community compared to PyTorch, limiting support, resources, and compatibility with non-JAX environments.
Ships only a handful of algorithm implementations, such as PPO, and lacks the breadth of more comprehensive RL libraries, which may restrict use cases beyond the provided examples.