A deep reinforcement learning library offering high-quality, single-file implementations of algorithms like PPO, DQN, and SAC for research and education.
CleanRL is a Deep Reinforcement Learning library that provides high-quality, single-file implementations of popular algorithms like PPO, DQN, and SAC. It is designed to be a clear, readable reference for researchers and learners who want to understand the complete implementation details of DRL algorithms without the abstraction layers of larger modular libraries. The project solves the problem of opaque or overly complex codebases by offering minimal, standalone scripts that are easy to study, modify, and extend for advanced research prototypes.
Reinforcement learning researchers, students, and practitioners who need to deeply understand algorithm implementations, prototype new features not supported by modular libraries, or seek a transparent and benchmarked codebase for experiments.
Developers choose CleanRL for its unparalleled code clarity and simplicity, which facilitates easier debugging, learning, and prototyping compared to more abstract modular libraries. Its single-file approach, comprehensive benchmarking, and research-oriented features like integrated experiment tracking provide a unique blend of educational value and practical utility for advanced DRL projects.
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Each algorithm variant is packaged as a single standalone file; ppo_atari.py, for example, is only about 340 lines, offering a complete, readable reference without the need to navigate complex modular hierarchies.
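To illustrate the single-file layout CleanRL follows (hyperparameters, environment, and training loop all in one readable script), here is a toy sketch using a k-armed bandit; the names and structure are illustrative only, not CleanRL's actual code:

```python
# Toy single-file RL script in the spirit of CleanRL's layout (illustrative
# only, not CleanRL code): args, environment, and training loop in one file.
import argparse
import random


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--n-arms", type=int, default=5)
    parser.add_argument("--total-timesteps", type=int, default=2000)
    parser.add_argument("--epsilon", type=float, default=0.1)
    parser.add_argument("--seed", type=int, default=1)
    return parser.parse_args()


class BanditEnv:
    """A k-armed bandit: each arm pays 1 with a fixed hidden probability."""

    def __init__(self, n_arms, rng):
        self.probs = [rng.random() for _ in range(n_arms)]
        self.rng = rng

    def step(self, action):
        return 1.0 if self.rng.random() < self.probs[action] else 0.0


def train(args):
    rng = random.Random(args.seed)
    env = BanditEnv(args.n_arms, rng)
    q = [0.0] * args.n_arms  # action-value estimates
    counts = [0] * args.n_arms
    for _ in range(args.total_timesteps):
        # epsilon-greedy action selection
        if rng.random() < args.epsilon:
            action = rng.randrange(args.n_arms)
        else:
            action = max(range(args.n_arms), key=q.__getitem__)
        reward = env.step(action)
        counts[action] += 1
        q[action] += (reward - q[action]) / counts[action]  # incremental mean
    return q, env.probs


if __name__ == "__main__":
    q, probs = train(parse_args())
    print("learned action values:", [round(v, 2) for v in q])
```

Because everything lives in one file, every hyperparameter and update rule is visible at a glance, which is the property the single-file philosophy trades code reuse for.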
Includes extensive benchmarks across 7+ algorithms and 34+ environments via the Open RL Benchmark, providing transparent performance evaluation and comparison data.
Integrates TensorBoard logging, Weights & Biases experiment tracking, and gameplay video capture, supporting reproducible research workflows.
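The logging pattern amounts to emitting (step, tag, value) scalar records that TensorBoard-style dashboards can chart. As a dependency-free illustration of that pattern (this `ScalarLogger` class is hypothetical, not CleanRL's API, which uses TensorBoard's `SummaryWriter`):

```python
import csv
import io

# Hypothetical stand-in for the scalar-logging pattern used with
# TensorBoard's SummaryWriter (illustrative only, not CleanRL code).
class ScalarLogger:
    def __init__(self, stream):
        self.writer = csv.writer(stream)
        self.writer.writerow(["step", "tag", "value"])

    def add_scalar(self, tag, value, step):
        # Mirrors the shape of SummaryWriter.add_scalar(tag, value, global_step)
        self.writer.writerow([step, tag, value])


if __name__ == "__main__":
    buf = io.StringIO()
    logger = ScalarLogger(buf)
    for step in range(3):
        logger.add_scalar("charts/episodic_return", 10.0 * step, step)
    print(buf.getvalue())
```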
Supports scaling to thousands of experiments through Docker and AWS Batch integration, enabling large-scale distributed training in the cloud.
CleanRL is explicitly not a modular library and not meant to be imported, limiting its use in projects that require building upon or extending existing codebases in a reusable way.
The single-file approach duplicates code across algorithm variants, which increases maintenance overhead and makes updates more cumbersome, a trade-off the project acknowledges in its design philosophy.
Requires installing multiple optional dependencies for different environment families (e.g., Atari, MuJoCo), complicating initial setup with separate install commands.
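As a rough sketch of what that setup looks like (the exact requirements-file names here follow the repository layout at the time of writing and may differ by version, so check the project's README):

```shell
# Clone the repository and install core plus per-environment dependencies
git clone https://github.com/vwxyzjn/cleanrl.git && cd cleanrl
pip install -r requirements/requirements.txt          # core dependencies
pip install -r requirements/requirements-atari.txt    # Atari extras
pip install -r requirements/requirements-mujoco.txt   # MuJoCo extras
```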