A Rust library for character encoding conversion based on the WHATWG Encoding Standard.
Encoding is a Rust library for converting text between legacy character encodings and Unicode. It implements the WHATWG Encoding Standard and provides configurable error handling, which helps applications process international text data reliably.
It is aimed at Rust developers working on text processing, internationalization (i18n), or data interoperability, especially those dealing with legacy systems or web standards.
Developers choose Encoding for its strict adherence to the WHATWG standard, comprehensive encoding support, and flexible error recovery mechanisms, making it a reliable choice for encoding-sensitive applications.
Character encoding support for Rust
Implements the WHATWG Encoding Standard for web-compatible encoding and decoding, ensuring interoperability with browsers and web technologies.
Supports a wide range of encodings including ASCII, UTF-8, UTF-16, ISO 8859 family, Windows code pages, and Asian encodings like EUC-JP and GBK, covering all WHATWG-specified encodings and more.
Provides multiple trap strategies (Strict, Replace, Ignore, NcrEscape) for graceful error recovery, with the ability to define custom traps for unrepresentable characters.
Offers a Cargo feature that shrinks the encoding tables from ~480KB to ~185KB, which helps users who need a smaller binary footprint, at the cost of slower encoding.
RawEncoder and RawDecoder are marked as experimental and may change substantially, so production code that depends on them risks breaking changes.
Enabling the size optimization feature (no-optimized-legacy-encoding) makes encoding 5x to 20x slower, which may be unacceptable for high-throughput applications.
The README explicitly states that ISO-2022-JP support is not yet up to date with the current standard, potentially leading to compatibility issues in specialized use cases.
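To make the four trap strategies listed among the features (Strict, Replace, Ignore, NcrEscape) more concrete, here is a minimal standalone sketch of how each one might treat a character that the target encoding cannot represent. It uses only the Rust standard library and does not call the Encoding crate: the `Trap` enum and the `encode_latin1` function are hypothetical names invented for illustration, and the target is plain ISO-8859-1, assumed here to map code points U+0000 to U+00FF directly to single bytes.

```rust
/// Hypothetical stand-in for the four trap strategies named above.
/// This is an illustration, not the library's own type.
#[derive(Clone, Copy, Debug)]
enum Trap {
    Strict,
    Replace,
    Ignore,
    NcrEscape,
}

/// Encode `input` as ISO-8859-1, assuming code points U+0000..=U+00FF
/// map to the byte with the same value. Anything above U+00FF is
/// unrepresentable and is handed to the chosen trap.
fn encode_latin1(input: &str, trap: Trap) -> Result<Vec<u8>, String> {
    let mut out = Vec::with_capacity(input.len());
    for ch in input.chars() {
        let cp = ch as u32;
        if cp <= 0xFF {
            out.push(cp as u8);
            continue;
        }
        match trap {
            // Stop at the first unrepresentable character.
            Trap::Strict => return Err(format!("U+{:04X} is not representable", cp)),
            // Substitute a fixed replacement byte.
            Trap::Replace => out.push(b'?'),
            // Drop the character silently.
            Trap::Ignore => {}
            // Emit an HTML-style numeric character reference.
            Trap::NcrEscape => out.extend_from_slice(format!("&#{};", cp).as_bytes()),
        }
    }
    Ok(out)
}

fn main() {
    // U+20AC (euro sign) has no ISO-8859-1 byte, so it triggers the trap.
    let text = "A\u{20AC}B";
    for trap in [Trap::Strict, Trap::Replace, Trap::Ignore, Trap::NcrEscape] {
        println!("{:?}: {:?}", trap, encode_latin1(text, trap));
    }
}
```

In this sketch only Strict can fail. The other three always return bytes, trading fidelity for progress in different ways, which is the kind of choice a trap strategy is meant to expose.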