A deep learning-based solution for automatically recognizing and solving 12306 railway website captchas.
12306-captcha is a deep learning project designed to automatically recognize and solve the image-based captchas on China's 12306 railway ticket booking website. It trains convolutional neural networks to identify both the objects and text labels in these captchas, enabling automation of booking processes. The project provides a complete pipeline from data collection and preprocessing to model training and deployment.
It is aimed at developers and researchers working on web automation, bot development, or computer vision projects who need to bypass 12306's captcha system. It also suits those learning practical deep learning applications with Caffe.
It offers a specialized, open-source solution for a widely encountered problem in China, with a reproducible training workflow and a ready-to-use web demo. Unlike generic captcha-solving services, it's tailored specifically to 12306's unique captcha format.
12306 captcha recognition based on deep learning
Uses two models, one for the image objects and one for the text labels, each trained specifically for 12306's captcha format, as shown in the separate training pipelines for images and words.
Provides end-to-end scripts for data download, cropping, preprocessing, and model training, offering a reproducible process from the README's step-by-step instructions.
Includes a simple web interface to test trained models on new captchas, making validation straightforward, as demonstrated in the provided screenshot and index.py script.
The README details setup, data handling, and training with specific scripts, accessible for users familiar with Caffe and Python.
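The two-model design can be illustrated with a small sketch of how the two sets of predictions might be combined once both classifiers have run. This is not the project's code: the function name `pick_matching_tiles`, the score dictionaries, and the `threshold` parameter are all hypothetical, and the class names are made up for the example.

```python
def pick_matching_tiles(label_scores, tile_scores, threshold=0.5):
    """Pick the tiles whose top class matches the label's top class.

    label_scores: dict mapping class name -> score from the text-label model.
    tile_scores:  list of such dicts, one per image tile, from the image model.
    threshold:    hypothetical minimum score for a tile to count as a match.
    """
    # The class the text-label model considers most likely.
    target = max(label_scores, key=label_scores.get)

    picks = []
    for index, scores in enumerate(tile_scores):
        best = max(scores, key=scores.get)
        # Keep the tile only if its top class agrees with the label
        # and the image model is reasonably confident.
        if best == target and scores[best] >= threshold:
            picks.append(index)
    return target, picks


if __name__ == "__main__":
    label = {"kettle": 0.9, "lion": 0.1}
    tiles = [
        {"kettle": 0.8, "lion": 0.2},
        {"kettle": 0.3, "lion": 0.7},
        {"kettle": 0.6, "lion": 0.4},
    ]
    print(pick_matching_tiles(label, tiles))
```

Keeping the matching step separate from both models means either classifier can be retrained or swapped without touching the selection logic.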
Requires hand-labeling the cropped images into categories, which is time-consuming, as the README acknowledges in its note on improving efficiency.
Built on Caffe, which has a declining ecosystem and fewer resources compared to modern frameworks, making integration with newer tools challenging.
Processes images by cropping and classifying individually instead of using object detection, leading to slower performance, as noted in the project's future optimization suggestions.
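The crop-then-classify approach described above can be sketched as a grid computation: each captcha is cut into fixed tiles, and every tile is classified on its own. The sketch below only computes the crop boxes. The grid shape, tile size, gaps, and offsets are illustrative assumptions rather than values taken from the project, and `tile_boxes` is a hypothetical helper.

```python
def tile_boxes(rows=2, cols=4, tile=67, gap=5, left=5, top=41):
    """Return (left, upper, right, lower) crop boxes for a grid of tiles.

    All defaults are assumed values for illustration; check them against
    real captcha images before relying on them.
    """
    boxes = []
    for row in range(rows):
        for col in range(cols):
            # Top-left corner of this tile within the captcha image.
            x = left + col * (tile + gap)
            y = top + row * (tile + gap)
            boxes.append((x, y, x + tile, y + tile))
    return boxes


if __name__ == "__main__":
    for box in tile_boxes():
        print(box)
```

The tuples use the same (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so each box could be passed to it directly. Because every tile then needs its own classifier pass, the cost grows with the number of tiles, which is the overhead an object detection model would avoid.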