Scene text detection with the Connectionist Text Proposal Network (CTPN), which localizes text lines in natural images.
CTPN is a deep learning model for detecting text lines in natural scene images. It combines convolutional and recurrent neural networks to localize text accurately in unconstrained environments. The project provides an implementation and a pre-trained model for scene text detection tasks.
Computer vision researchers and developers working on optical character recognition (OCR), document analysis, or scene understanding who need robust text detection in images.
CTPN offers a specialized, research-backed approach to text detection that outperforms generic object detectors for text localization. Its open-source implementation and pre-trained model allow developers to integrate state-of-the-art text detection without training from scratch.
Detecting Text in Natural Image with Connectionist Text Proposal Network (ECCV'16)
Based on the ECCV 2016 paper, CTPN combines CNN and RNN architectures to capture text sequence context, providing robust detection in natural scenes.
Offers a 78MB trained model ready for inference, saving significant time and resources compared to training from scratch.
Runs on a GPU with cuDNN for faster processing and requires about 1.5GB of memory, as noted in the README.
Designed specifically for text-line detection, treating text as sequences of fine-scale proposals to outperform generic object detectors.
Requires compiling Caffe with legacy dependencies such as Python 2.7, CUDA 7.0, and cuDNN 3.0, which are difficult to install on modern systems.
Targets horizontal text lines, so it is ineffective for rotated or curved text, and it omits side-refinement, which can leave the left and right ends of detected lines less precisely localized.
The README admits the CPU implementation is non-optimal and extremely slow, necessitating a GPU for practical use.
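The idea of treating a text line as a sequence of fine-scale proposals can be illustrated with a small sketch. This is not the repository's code: it is a simplified greedy grouping written for illustration, and the function names (`vertical_overlap`, `connect_proposals`) are hypothetical. The default thresholds (a 50-pixel horizontal gap and a 0.7 vertical overlap) loosely follow the values described in the paper and are assumptions here, as is measuring the gap between box edges.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2); CTPN-style proposals are narrow, fixed-width slices.
Box = Tuple[float, float, float, float]


def vertical_overlap(a: Box, b: Box) -> float:
    """Shared vertical extent of two boxes, as a fraction of the taller box."""
    inter = min(a[3], b[3]) - max(a[1], b[1])
    if inter <= 0:
        return 0.0
    return inter / max(a[3] - a[1], b[3] - b[1])


def connect_proposals(
    boxes: List[Box],
    max_gap: float = 50.0,      # assumed horizontal gap threshold, in pixels
    min_overlap: float = 0.7,   # assumed vertical overlap threshold
) -> List[Box]:
    """Greedily chain narrow proposals into text-line bounding boxes."""
    lines: List[List[Box]] = []
    # Walk proposals left to right and attach each one to the first line
    # whose last box is close enough and vertically aligned.
    for box in sorted(boxes, key=lambda b: b[0]):
        for line in lines:
            last = line[-1]
            close = box[0] - last[2] <= max_gap
            if close and vertical_overlap(last, box) >= min_overlap:
                line.append(box)
                break
        else:
            lines.append([box])  # no compatible line: start a new one
    # The text line is the bounding box of its chained proposals.
    return [
        (
            min(b[0] for b in line),
            min(b[1] for b in line),
            max(b[2] for b in line),
            max(b[3] for b in line),
        )
        for line in lines
    ]
```

Given three adjacent 16-pixel-wide proposals near the top of an image and two more far to the right and lower down, `connect_proposals` returns two line boxes, one per group. Because the grouping only chains proposals horizontally, the sketch also shows why this style of detector struggles with rotated or curved text.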