An open-source implementation reproducing DeepMind's Atari-playing deep reinforcement learning system from their seminal 2013 paper.
Replicating-DeepMind is an open-source reproduction of the system described in DeepMind's 2013 paper on deep reinforcement learning for Atari games. Its codebase learns to play Atari 2600 games directly from pixel input using deep Q-learning, and it serves as both a research reproduction and an educational tool for understanding how deep reinforcement learning works.
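To make "deep Q-learning" concrete, the following is a minimal sketch of how the training target for each transition is typically formed in Q-learning. It is not taken from the Replicating-DeepMind codebase: the function name `q_learning_targets`, its arguments, and the discount value `gamma=0.99` are illustrative assumptions, and the next-state Q-values would in practice come from the neural network rather than from hand-written lists.

```python
def q_learning_targets(rewards, next_q_values, terminal, gamma=0.99):
    """Compute Q-learning targets for a batch of transitions.

    Hypothetical helper for illustration only.
    rewards:       list of rewards r, one per transition
    next_q_values: list of lists, Q(s', a') for every action a'
    terminal:      list of bools, True if s' ended the episode
    gamma:         discount factor (assumed value, not from the project)
    """
    targets = []
    for r, q_next, done in zip(rewards, next_q_values, terminal):
        if done:
            # No future reward after a terminal state.
            targets.append(float(r))
        else:
            # Bootstrap from the best action value in the next state.
            targets.append(float(r) + gamma * max(q_next))
    return targets


if __name__ == "__main__":
    batch_targets = q_learning_targets(
        rewards=[1.0, 0.0],
        next_q_values=[[0.5, 2.0], [1.0, 3.0]],
        terminal=[False, True],
        gamma=0.9,
    )
    print(batch_targets)
```

The network is then trained to move its prediction for the action actually taken toward this target, usually by minimizing a squared error over a minibatch.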
It is aimed at machine learning researchers, AI students, and developers who want to understand and experiment with deep reinforcement learning, particularly those studying the historical development of AI systems that learn from raw sensory input.
This project offers a transparent, working implementation of a landmark AI system that's fully open-source and modifiable, unlike DeepMind's proprietary original. It provides hands-on experience with deep reinforcement learning while documenting the practical challenges of reproducing research results.
Reproducing the results of "Playing Atari with Deep Reinforcement Learning" by DeepMind
Provides an implementation of DeepMind's landmark 2013 paper for hands-on study of deep Q-learning from pixel input, in line with the project's stated goal of replicating the research.
Uses cuda-convnet2 for GPU-accelerated neural network training, and the README mentions running on a GPU cluster, which shortens experiment turnaround.
Tracks learning progress and compares against random baselines, helping users understand system behavior and improvement over time.
Serves as a foundation for extending DeepMind's architecture, with documentation in the Wiki for community-driven exploration and modification.
The README states that RMSprop, a key optimizer from the original paper, is not implemented, which likely hampers learning efficiency and accuracy.
The agent performs only slightly better than a random policy and runs about 2x slower than DeepMind's original system, which limits its usefulness where high scores matter.
Requires a GPU cluster and cuda-convnet2 setup, making it inaccessible for users without specialized infrastructure or technical expertise.
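For readers unfamiliar with the optimizer named in the first limitation, here is a minimal sketch of a generic RMSprop update step. It is a textbook-style formulation, not code from this project or from DeepMind; the function name `rmsprop_step` and the hyperparameter values (`lr`, `decay`, `eps`) are assumptions chosen for illustration.

```python
import math


def rmsprop_step(params, grads, cache, lr=0.01, decay=0.9, eps=1e-8):
    """Apply one RMSprop update to a flat list of parameters.

    Hypothetical helper for illustration only.
    params: current parameter values
    grads:  gradient of the loss for each parameter
    cache:  running average of squared gradients (same length as params)
    Returns the updated (params, cache) as new lists.
    """
    new_params, new_cache = [], []
    for p, g, c in zip(params, grads, cache):
        # Exponential moving average of the squared gradient.
        c = decay * c + (1.0 - decay) * g * g
        # Scale the step by the root of that average, so parameters with
        # consistently large gradients take proportionally smaller steps.
        p = p - lr * g / (math.sqrt(c) + eps)
        new_params.append(p)
        new_cache.append(c)
    return new_params, new_cache


if __name__ == "__main__":
    params, cache = [1.0, -2.0], [0.0, 0.0]
    params, cache = rmsprop_step(params, [2.0, -0.5], cache, lr=0.1)
    print(params, cache)
```

Compared with plain stochastic gradient descent, the per-parameter scaling tends to make training less sensitive to the raw magnitude of the gradients, which is one reason its absence may matter for a reproduction.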