A proof-of-concept system that defeats Google's audio reCaptcha with 85% accuracy using speech recognition and browser automation.
unCaptcha is a proof-of-concept system designed to defeat Google's audio reCaptcha with 85% accuracy. It automates the process of solving audio captchas by using speech recognition services and browser automation to mimic human interaction. The project highlights vulnerabilities in widely used bot-detection systems.
unCaptcha is aimed at security researchers, academics, and developers interested in captcha vulnerabilities, automated testing, and web security. It is also relevant to those studying speech recognition and browser automation techniques.
unCaptcha provides a documented, open-source method to analyze and bypass audio captchas, serving as a valuable resource for understanding security weaknesses. Its multi-service ensembling approach demonstrates innovative techniques for improving speech recognition accuracy in adversarial contexts.
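To make the ensembling idea concrete, here is a minimal sketch of combining transcripts from several speech recognition services into one answer. It is not the project's actual heuristic, which this page does not specify: it swaps in a simple weighted majority vote, and the function name `ensemble_digits`, the service labels, and the weights are all hypothetical.

```python
from collections import defaultdict


def ensemble_digits(transcripts, weights=None):
    """Pick one digit string from several transcripts by weighted vote.

    transcripts: dict mapping a (hypothetical) service label to its raw text.
    weights: optional dict mapping a service label to a trust weight;
             services without an entry count as 1.0.
    Returns the digit string with the highest total weight, or None if
    no transcript contained any digits.
    """
    weights = weights or {}
    scores = defaultdict(float)
    for service, text in transcripts.items():
        # Keep only digits so "4 1 7" and "417" count as the same answer.
        digits = "".join(ch for ch in text if ch.isdigit())
        if digits:
            scores[digits] += weights.get(service, 1.0)
    if not scores:
        return None
    # The candidate with the largest accumulated weight wins.
    return max(scores, key=scores.get)


if __name__ == "__main__":
    sample = {"service_a": "4 1 7", "service_b": "417", "service_c": "477"}
    print(ensemble_digits(sample))
```

In this sketch, two services agreeing on the same digits outvote a single dissenting service. A per-service weight lets a more reliable service count for more, which is one plausible way an ensemble can outperform any single recognizer.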
Combines results from six speech recognition services (IBM, Google Cloud, etc.) using a probabilistic heuristic, achieving high accuracy in number identification as documented in the paper.
Published at USENIX WOOT '17 with a detailed paper and slides, providing thorough research backing and transparency into the methodology.
Explicitly framed as a proof-of-concept to highlight security vulnerabilities, with warnings about responsible disclosure and legal compliance.
Uses Selenium to simulate human interaction with websites, demonstrating a full attack chain from navigation to captcha solving on Reddit.
The README acknowledges that Google has since updated reCaptcha with additional protections, so the tool is no longer effective, and it states explicitly that the project is not maintained.
Requires installation of multiple tools like sox, ffmpeg, and Selenium, plus API keys from six external services, which is cumbersome and time-consuming.
Specifically targets Reddit's signup page and audio captchas only, with no adaptation for other sites or captcha types, reducing its utility beyond research.
Using this tool could violate a site's terms of service or applicable laws. The project warns users to be mindful of the legal implications, so responsible use requires extra care.