A privacy-focused AI answering engine that runs on your own hardware, combining web search with local and cloud LLMs.
Open-source AI platform for building private agents, assistants, and enterprise search with document analysis and multi-model support.
A Python library for extracting and analyzing text, images, and metadata from PDF documents.
A Windows tool for extracting metadata and hidden information from documents, whether found on web pages or stored as local files.
A C library for efficient image processing and analysis, widely used in OCR and computer vision applications.
A curated list of resources for Document Understanding (DU), covering research, datasets, tools, and applications in Intelligent Document Processing.
A curated collection of learning resources, R packages, and practical examples for understanding and applying topic modeling techniques.
Serverless data pipeline for crawling PDFs from the web and extracting structured data using AWS Textract.
A Python toolbox using deep belief networks for topic modeling on document data, producing latent representations for content-based recommendation.