Tesseract

12 projects

tesseract C++

An open-source OCR engine that converts images to text, supporting over 100 languages and multiple output formats.

#c-plus-plus-library #hacktoberfest #open-source

A command-line tool that adds an OCR text layer to scanned PDF files, making them searchable and copy-pasteable.

#text-extraction #pdf-ocr #pdf-a

Stars: 34.3k

Forks: 2.4k

Last commit: 2 days ago

Kreuzberg Rust

A polyglot document intelligence framework with a Rust core for extracting text, metadata, and structured data from 91+ file formats.

#text-extraction #document-intelligence #batch-processing

Stars: 8.7k

Forks: 526

Last commit: 15 hours ago

TagUI JavaScript

An open-source RPA tool that automates repetitive tasks on websites, desktop apps, and the command line using a simple language.

#ai #opencv #workflow-automation

Stars: 6.3k

Forks: 645

Last commit: 3 days ago

gosseract Go

A Go package for Optical Character Recognition (OCR) using the Tesseract C++ library.

#text-extraction #tesseract-ocr #go-library

Stars: 3.1k

Forks: 307

Last commit: 6 months ago

Tess4J Java

A Java JNA wrapper for the Tesseract OCR API, enabling OCR functionality in Java applications.

#text-extraction #pdf-ocr #java

Stars: 1.8k

Forks: 381

Last commit: 1 month ago

TextSnatcher Vala

A lightweight Linux desktop application that extracts text from images using OCR with drag-and-drop simplicity.

#text-extraction #libhandy #tesseract-ocr

Stars: 1.4k

Forks: 54

Last commit: 2 years ago

ocrserver Go

A simple OCR API server that's easy to deploy with Docker or on Heroku.

#text-extraction #api #api-server

Stars: 767

Forks: 147

Last commit: 5 years ago

tesseract-ocr Ruby

A Ruby wrapper library that provides Ruby bindings and a Ruby-esque interface to the Tesseract OCR API.

#ruby-wrapper #ffi #tesseract-ocr

Stars: 636

Forks: 71

Last commit: 9 years ago

max-ocr Python

A Docker-based optical character recognition model that extracts text from images using Tesseract.

#ibm-cloud #microservice #rest-api

Stars: 51

Forks: 31

Last commit: 10 months ago

Wagtail-Textract Python

Enables full-text search within uploaded documents (PDF, Word, Excel) in Wagtail CMS.

#search #text-extraction #pdf-search

Stars: 34

Forks: 14

Last commit: 2 years ago

vesseract V

A V programming language wrapper for Tesseract-OCR, enabling text extraction and OCR operations from images.

#text-extraction #document-analysis #wrapper-library

Stars: 17

Forks: 3

Last commit: 4 years ago
