A Rust library for parsing and generating documents across 13+ formats using a unified Common Document Model.
Shiva is a Rust library that implements a parser and generator for documents in many formats. Instead of requiring a separate converter for every pair of formats, it provides a unified Common Document Model (CDM) that serves as an intermediate representation: each format is parsed into the CDM and generated from it, which enables conversion between formats like HTML, Markdown, PDF, DOCX, and JSON.
Shiva is aimed at Rust developers who need to process, convert, or generate documents across multiple formats in their applications, particularly those building document processing pipelines, reporting tools, or content conversion systems.
Developers choose Shiva because it provides a single, consistent Rust API for working with 13+ document formats, eliminating the need to integrate multiple specialized libraries. Its Common Document Model ensures structural consistency during conversions, and its native Rust implementation offers performance and safety benefits.
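To make the intermediate-representation idea concrete, here is a minimal, self-contained sketch of a document model with one parser and one generator. The names `Element`, `Document`, `parse_markdown`, and `generate_html` are hypothetical and invented for illustration; they are not Shiva's actual types or API, and the parser handles only a tiny Markdown subset.

```rust
// Hypothetical mini document model for illustration; NOT Shiva's real types.
#[derive(Debug, Clone, PartialEq)]
enum Element {
    Header { level: u8, text: String },
    Paragraph(String),
    List(Vec<String>),
}

#[derive(Debug, Clone, PartialEq, Default)]
struct Document {
    elements: Vec<Element>,
}

/// Parse a tiny Markdown subset (`#` headers, `- ` list items, plain lines)
/// into the shared model.
fn parse_markdown(input: &str) -> Document {
    let mut elements: Vec<Element> = Vec::new();
    for line in input.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        if let Some(item) = line.strip_prefix("- ") {
            // Append to the list being built, or start a new one.
            match elements.last_mut() {
                Some(Element::List(items)) => items.push(item.to_string()),
                _ => elements.push(Element::List(vec![item.to_string()])),
            }
        } else if line.starts_with('#') {
            // Count leading '#' characters to get the header level.
            let level = line.chars().take_while(|c| *c == '#').count();
            let text = line[level..].trim().to_string();
            elements.push(Element::Header { level: level.min(6) as u8, text });
        } else {
            elements.push(Element::Paragraph(line.to_string()));
        }
    }
    Document { elements }
}

/// Generate HTML from the shared model.
/// Note: this sketch does not escape HTML special characters.
fn generate_html(doc: &Document) -> String {
    let mut out = String::new();
    for element in &doc.elements {
        match element {
            Element::Header { level, text } => {
                out.push_str(&format!("<h{level}>{text}</h{level}>\n"));
            }
            Element::Paragraph(text) => out.push_str(&format!("<p>{text}</p>\n")),
            Element::List(items) => {
                out.push_str("<ul>\n");
                for item in items {
                    out.push_str(&format!("<li>{item}</li>\n"));
                }
                out.push_str("</ul>\n");
            }
        }
    }
    out
}
```

In this sketch, a conversion is simply `generate_html(&parse_markdown(source))`. Because every format talks only to the shared model, supporting N formats takes N parsers and N generators rather than a dedicated converter for each pair of formats.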
Shiva library: Implementation in Rust of a parser and generator for documents of any type
The Common Document Model serves as a consistent intermediate representation for all supported formats, enabling seamless conversions without managing multiple disparate APIs.
Parses and generates over a dozen formats including HTML, Markdown, PDF, and DOCX, as shown in the supported document types table, reducing integration complexity.
Handles key elements like headers, tables, and images across formats, detailed in the parse and generate feature tables, ensuring content integrity during conversions.
Implemented in Rust, it benefits from the language's memory safety and performance, making it suitable for high-throughput document processing tasks.
Includes a command-line interface and an HTTP server for out-of-the-box conversion without writing code, which makes it accessible for quick tasks and easy to integrate into existing workflows.
Some formats have limited support; for example, PDF parsing only handles paragraphs and lists, missing headers and images, as per the parse document features table.
Image handling is not uniform: plain text and JSON lack image support in generation, and many formats have only partial support, which limits multimedia document processing.
Adding new document types requires implementing Rust traits like TransformerTrait, which can present a steep learning curve for developers unfamiliar with the library's internals or with Rust.
With only one listed user (Metatron library), community support and ecosystem maturity are minimal, potentially affecting long-term maintenance and bug fixes.
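As a rough illustration of what trait-based format support involves, and of how a format can end up only partially supported, the sketch below defines a hypothetical `FormatTransformer` trait with a parse side and a generate side. This is an assumption made for illustration: the trait, the `FormatError` type, and the `PlainText` and `Html` structs are invented here, and the signatures of Shiva's real `TransformerTrait` may differ.

```rust
use std::fmt;

// Hypothetical document model for illustration; NOT Shiva's real types.
#[derive(Debug, Clone, PartialEq)]
enum Element {
    Header { level: u8, text: String },
    Paragraph(String),
}

#[derive(Debug, Clone, PartialEq, Default)]
struct Document {
    elements: Vec<Element>,
}

#[derive(Debug, PartialEq)]
struct FormatError(String);

impl fmt::Display for FormatError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl std::error::Error for FormatError {}

/// Hypothetical per-format trait: one parser and one generator per format.
trait FormatTransformer {
    fn parse(input: &[u8]) -> Result<Document, FormatError>;
    fn generate(doc: &Document) -> Result<Vec<u8>, FormatError>;
}

/// Plain text: blocks separated by blank lines become paragraphs.
struct PlainText;

impl FormatTransformer for PlainText {
    fn parse(input: &[u8]) -> Result<Document, FormatError> {
        let text = std::str::from_utf8(input).map_err(|e| FormatError(e.to_string()))?;
        let elements = text
            .split("\n\n")
            .map(str::trim)
            .filter(|block| !block.is_empty())
            .map(|block| Element::Paragraph(block.to_string()))
            .collect();
        Ok(Document { elements })
    }

    fn generate(doc: &Document) -> Result<Vec<u8>, FormatError> {
        // Plain text has no header markup, so headers degrade to bare text.
        let blocks: Vec<&str> = doc
            .elements
            .iter()
            .map(|element| match element {
                Element::Header { text, .. } | Element::Paragraph(text) => text.as_str(),
            })
            .collect();
        Ok(blocks.join("\n\n").into_bytes())
    }
}

/// HTML: generation only. Parsing is deliberately left unsupported to show
/// how a format can be supported in one direction but not the other.
struct Html;

impl FormatTransformer for Html {
    fn parse(_input: &[u8]) -> Result<Document, FormatError> {
        Err(FormatError("HTML parsing is not implemented in this sketch".to_string()))
    }

    fn generate(doc: &Document) -> Result<Vec<u8>, FormatError> {
        let mut out = String::new();
        for element in &doc.elements {
            match element {
                Element::Header { level, text } => {
                    out.push_str(&format!("<h{level}>{text}</h{level}>\n"));
                }
                Element::Paragraph(text) => out.push_str(&format!("<p>{text}</p>\n")),
            }
        }
        Ok(out.into_bytes())
    }
}

/// Convert between any two formats by going through the shared model.
fn convert<Src: FormatTransformer, Dst: FormatTransformer>(
    input: &[u8],
) -> Result<Vec<u8>, FormatError> {
    Dst::generate(&Src::parse(input)?)
}
```

With these definitions, `convert::<PlainText, Html>(b"Hello\n\nWorld")` produces HTML paragraphs, while `convert::<Html, PlainText>(...)` returns an error because the sketch's HTML parser is missing. Each new format must map its own constructs onto the shared element set in both directions, which is where most of the implementation effort tends to go.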