A deep learning model that reads IRCTC captchas with 98% accuracy, demonstrating their vulnerability to automated booking.
Captcha.irctc is a deep learning project that trains neural networks to automatically read and solve captcha images from the IRCTC railway booking website. It demonstrates how machine learning can bypass traditional security measures designed to stop bots, achieving 98% accuracy in reading the text from these captchas. The project highlights the vulnerability of IRCTC's current captcha system to automated ticket booking software.
It is aimed at machine learning researchers, security enthusiasts, and developers interested in computer vision, captcha vulnerabilities, or applying deep learning to real-world problems.
It provides a working, high-accuracy model built specifically for IRCTC captchas, using a residual network architecture and a custom training criterion. The project serves as a practical case study in breaking captcha systems with deep learning, complete with code and methodology.
Reading IRCTC captchas with 98% accuracy using deep learning
Achieves 98% accuracy on IRCTC captchas using a residual network architecture, according to the test results reported in the README.
Uses a 34-layer residual network, a design that builds on VGG-style networks, together with a custom MultiCrossEntropyCriterion that recognizes the character sequence without RNNs.
Applies data augmentation to make the model more robust to variation across captcha samples, as noted in the code updates.
Designed to train on NVIDIA GPUs such as the Titan 780 named in the requirements section, which speeds up training significantly.
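To make the "sequence recognition without RNNs" idea concrete, here is a minimal sketch of a multi-position cross-entropy loss in plain Python. It is not the project's Torch code. It assumes the network emits one vector of class scores per character position and that the per-position losses are summed; the names `multi_cross_entropy` and `decode`, and the three-letter alphabet, are hypothetical and used only for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of class scores."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def multi_cross_entropy(position_logits, target_indices):
    """Sum of independent cross-entropy losses, one per character position.

    position_logits: one list of class scores per character position.
    target_indices:  the index of the correct class at each position.
    Summing (rather than averaging) is an assumption of this sketch.
    """
    if len(position_logits) != len(target_indices):
        raise ValueError("need exactly one target per character position")
    total = 0.0
    for logits, target in zip(position_logits, target_indices):
        total += -log_softmax(logits)[target]
    return total


def decode(position_logits, alphabet):
    """Pick the highest-scoring class at each position and join the characters."""
    chars = []
    for logits in position_logits:
        best = max(range(len(logits)), key=logits.__getitem__)
        chars.append(alphabet[best])
    return "".join(chars)


if __name__ == "__main__":
    alphabet = "ABC"  # hypothetical 3-symbol alphabet, for illustration only
    scores = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]  # two character positions
    print(decode(scores, alphabet))
    print(multi_cross_entropy(scores, [0, 1]))
```

Because each position gets its own classifier output, the whole captcha is read in a single forward pass, with no recurrent state carried from one character to the next.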
The model is trained only for IRCTC's captcha style and requires modifications for other websites, limiting generalizability.
Requires installation of Torch, CUDA, cutorch, and csvigo, which can be challenging for users unfamiliar with these niche tools.
Needs around 10,000 labeled samples for training, as noted in the README, which makes it resource-intensive to replicate for new applications.
Demonstrates vulnerability that could be misused for automated ticket booking, raising concerns about responsible use and potential violations.