JAX (Flax) implementations of reinforcement learning algorithms for continuous action spaces, designed for research.
JAXRL is a collection of reinforcement learning algorithms implemented in JAX and Flax, specifically designed for continuous action space environments. It provides clean, research-focused implementations of algorithms like SAC, AWAC, DDPG, and REDQ to help researchers build upon and experiment with modern RL techniques.
It is aimed at reinforcement learning researchers and practitioners who want to experiment with JAX-based implementations of continuous control algorithms, particularly those working on offline/online RL, pixel-based control, or algorithmic extensions.
It offers simple, modular implementations that take advantage of JAX's GPU acceleration and automatic differentiation while staying readable enough for research prototyping, in contrast to larger and more complex baseline repositories.
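To illustrate the two JAX features mentioned above, here is a minimal sketch of a jit-compiled gradient step in plain JAX. It is not taken from the JAXRL codebase: the linear value function, the parameter names `w` and `b`, and the learning rate are all assumptions chosen for illustration.

```python
import jax
import jax.numpy as jnp

LEARNING_RATE = 0.1  # hypothetical value, chosen only for this sketch


def mse_loss(params, obs, targets):
    # Hypothetical linear value function: V(s) = s @ w + b
    preds = obs @ params["w"] + params["b"]
    return jnp.mean((preds - targets) ** 2)


@jax.jit  # compiled with XLA; runs on a GPU when one is available
def sgd_step(params, obs, targets):
    # Automatic differentiation: loss value and gradients in one pass
    loss, grads = jax.value_and_grad(mse_loss)(params, obs, targets)
    new_params = jax.tree_util.tree_map(
        lambda p, g: p - LEARNING_RATE * g, params, grads
    )
    return new_params, loss


params = {"w": jnp.zeros(3), "b": jnp.zeros(())}
obs = jnp.ones((4, 3))
targets = jnp.ones((4,))

params, first_loss = sgd_step(params, obs, targets)
params, second_loss = sgd_step(params, obs, targets)
```

The same function runs unchanged on CPU or GPU, which is the main reason a small, readable codebase can still train quickly.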
JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces.
Implementations are simple and modular, prioritizing readability for easy extension and experimentation, as stated in the repository's goal.
Uses JAX for GPU acceleration and automatic differentiation, and the README includes installation notes for CUDA support so that training can run on a GPU.
Includes key algorithms like SAC with learnable temperature and AWAC for offline RL, supported by citations and example results for continuous control.
Explicitly aimed at prototyping and extensions rather than benchmarking, making it ideal for algorithmic modifications and new experiments.
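As an example of what "SAC with learnable temperature" means in practice, the following is a minimal sketch of the temperature update in plain JAX. It is an illustration under stated assumptions, not the repository's implementation: the function names, the log-space parameterization `log_alpha`, and the plain gradient-descent update are hypothetical choices, and real implementations typically use an optimizer such as Adam.

```python
import jax
import jax.numpy as jnp


def temperature_loss(log_alpha, log_probs, target_entropy):
    # Parameterize the temperature in log space so it stays positive
    alpha = jnp.exp(log_alpha)
    # Monte Carlo estimate of policy entropy from sampled action log-probs
    entropy = -jnp.mean(log_probs)
    # Gradient lowers alpha when entropy exceeds the target, raises it otherwise
    return alpha * (entropy - target_entropy)


@jax.jit
def update_temperature(log_alpha, log_probs, target_entropy, lr):
    loss, grad = jax.value_and_grad(temperature_loss)(
        log_alpha, log_probs, target_entropy
    )
    # Plain gradient descent, used here only to keep the sketch dependency-free
    return log_alpha - lr * grad, loss


log_alpha = jnp.array(0.0)          # alpha starts at 1.0
log_probs = jnp.full((8,), -2.0)    # hypothetical batch: policy entropy is 2.0
target_entropy = -1.0               # assumed target, e.g. based on action dim

new_log_alpha, loss = update_temperature(log_alpha, log_probs, target_entropy, 0.01)
```

Because the policy in this toy batch is more random than the target, the update decreases `log_alpha`, which reduces the weight of the entropy bonus in the actor and critic objectives.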
Installation requires Poetry, specific system dependencies for MuJoCo, and manual GPU configuration, which can be error-prone and time-consuming.
The README points to an updated version (jaxrl2), suggesting this repository may be outdated or less actively supported for new features.
Focus on continuous control and specific algorithms means it lacks support for discrete actions or broader RL tasks, limiting its applicability.