A Python library for reading, writing, repairing, and transforming PDFs, powered by the qpdf C++ library.
pikepdf gives Python programs a robust, Pythonic API over qpdf for handling PDFs: it can automatically repair damaged files, manipulate pages, edit metadata, and work with encrypted documents while preserving PDF/A compliance.
pikepdf is aimed at Python developers who need to manipulate existing PDFs programmatically, for example when building tools for PDF repair, merging, splitting, metadata editing, or low-level PDF object surgery. It also suits workflows that require encryption support or PDF/A compliance.
Developers choose pikepdf because it combines the maturity and correctness of the qpdf C++ library with a clean Python interface, offering automatic PDF repair, lossless operations, and comprehensive low-level access without separate external tools or complex setup.
Repairs structural damage automatically when a file is opened, using qpdf's mature repair capabilities to handle malformed PDFs without user intervention.
Provides dictionary-style access to PDF objects and list-style page manipulation, making it intuitive for Python developers familiar with the language's idioms.
Extracts and replaces images without re-encoding compressed formats like JPEG, preserving original quality and avoiding generation loss.
Supports AES-256, AES-128, and RC4 encryption for secure documents and maintains PDF/A conformance during edits, crucial for archival workflows.
The README explicitly states that digital signature-based encryption is not currently supported, limiting use cases for secure document validation.
Its API mirrors the PDF specification closely, which can be overwhelming for developers seeking only high-level operations like simple merges or splits.
Relies on the qpdf C++ library, which, while providing robustness, adds a compiled dependency that might complicate deployment in restricted environments compared to pure-Python alternatives.