A self-hosted document management system that scans, indexes, and archives paper documents with OCR and encryption.
Paperless is an open-source document management system that digitizes paper documents by scanning, performing OCR, indexing, and archiving them securely. It solves the problem of physical document clutter and loss by converting paperwork into searchable, encrypted digital files. The system automates ingestion from scanners and provides a web interface for easy retrieval.
Paperless suits individuals, small businesses, and organizations that want to reduce paper usage and manage documents digitally while keeping full control over their data. It's ideal for those handling sensitive paperwork such as tax records, invoices, or legal documents who prefer self-hosting.
Developers choose Paperless for its privacy-focused, self-hosted design, seamless integration with existing scanners, and use of robust open-source tools like Tesseract and GPG. It offers a lightweight, customizable alternative to complex enterprise document management systems.
Scan, index, and archive all of your paper documents
Paperless supports network scanners with FTP upload and local directory consumption, enabling hands-free document ingestion without proprietary software.
Uses GPG to encrypt original PDF files at rest, and the README explicitly recommends local hosting to keep sensitive data off untrusted servers.
Relies on battle-tested open-source tools like Tesseract for OCR and Unpaper for image cleanup, ensuring reliable text extraction from scans.
Built on Django with a minimalist philosophy, it avoids bloat and gives users full control over data, unlike cloud-based alternatives.
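As a rough illustration of the ingestion flow described above (consumption directory, image cleanup, OCR, encryption of the original), the following Python sketch plans the external commands for each waiting scan without running them. It is not Paperless's actual code: the function names, the set of file extensions, the intermediate `.pnm` files, and the `.gpg` naming are assumptions made for illustration. The `convert`, `unpaper`, `tesseract`, and `gpg` invocations use those tools' standard command-line forms, and a single-page image is assumed.

```python
from pathlib import Path

# Hypothetical set of extensions treated as scans; not taken from Paperless.
SCAN_SUFFIXES = {".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def pending_scans(consume_dir: Path) -> list[Path]:
    """Return scan files waiting in the consumption directory, sorted by name."""
    return sorted(
        p for p in consume_dir.iterdir()
        if p.is_file() and p.suffix.lower() in SCAN_SUFFIXES
    )


def plan_commands(
    scan: Path, work_dir: Path, archive_dir: Path, lang: str = "eng"
) -> list[list[str]]:
    """Build (but do not run) the commands for one single-page scan."""
    raw = work_dir / (scan.stem + ".pnm")            # image converted for unpaper
    cleaned = work_dir / (scan.stem + ".clean.pnm")  # deskewed, denoised image
    encrypted = archive_dir / (scan.name + ".gpg")   # encrypted original
    return [
        # ImageMagick: convert the scan to PNM, the format unpaper reads.
        ["convert", str(scan), str(raw)],
        # Unpaper: clean up the scanned page before OCR.
        ["unpaper", str(raw), str(cleaned)],
        # Tesseract: write recognized text to standard output.
        ["tesseract", str(cleaned), "stdout", "-l", lang],
        # GPG: symmetric encryption, passphrase supplied on standard input.
        ["gpg", "--batch", "--pinentry-mode", "loopback", "--passphrase-fd", "0",
         "--symmetric", "--output", str(encrypted), str(scan)],
    ]
```

Each planned command could then be passed to `subprocess.run`, with the Tesseract output captured for the search index and the passphrase written to the GPG process's standard input.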
The project is archived with no active development; the maintainer recommends Paperless-ng for updates, leaving users without official support or bug fixes.
OCR text is stored in plaintext in the database for searchability, so document contents are exposed if the system is compromised even though the original files are encrypted; the README's 'Important Note' describes this as a security hole.
Requires configuring multiple dependencies like Tesseract, ImageMagick, and GPG, plus a Django deployment; there is no out-of-the-box installer or Docker image in the main repo.
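Because setup depends on several external programs, a quick pre-flight check can save debugging time. The sketch below is a hypothetical helper, not part of Paperless. It assumes the tools are installed under their usual executable names (`convert` being ImageMagick's classic entry point) and simply reports which ones are not on the `PATH`.

```python
import shutil
from typing import Callable, Iterable, Optional

# Assumed executable names; adjust them to match the local installation.
REQUIRED_TOOLS = ("tesseract", "unpaper", "convert", "gpg")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be found on the PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("missing external tools: " + ", ".join(missing))
    else:
        print("all external tools found")
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` is enough.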