A fast, deterministic MySQL dump anonymizer that preserves referential integrity while anonymizing sensitive data.
Myanon is a specialized tool for anonymizing MySQL database dumps, enabling safe sharing of production data for development, testing, or analytics. It processes SQL dumps as a stream with zero intermediate storage, ensuring efficiency with large datasets while maintaining foreign key relationships through deterministic hashing.
Myanon is aimed at database administrators, developers, and data engineers who need to share or work with production MySQL data in non-production environments without exposing sensitive information.
Developers choose Myanon for its deterministic hashing that automatically preserves referential integrity across tables, its stream-based performance that handles large datasets efficiently, and its extensible architecture with optional Python support for custom anonymization logic.
A mysqldump anonymizer
Uses HMAC-SHA256 hashing to produce consistent anonymized values, automatically preserving foreign key relationships across tables, as demonstrated in the demo where emails hash to the same value in different tables.
Includes a built-in parser for anonymizing specific paths within JSON columns, supporting nested objects and arrays, which is detailed in the features table with the `json` rule.
Processes SQL dumps directly without intermediate storage, making it fast and memory-efficient for large datasets, as emphasized in the README's performance claims.
Offers optional Python support for custom anonymization logic, such as integrating with Faker for realistic fake data, with utilities like `get_row()` for cross-field logic.
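To make the deterministic hashing and stream processing described above more concrete, here is a minimal Python sketch. It is not Myanon's implementation: the secret, the `pseudonymize` and `anonymize_lines` helpers, the regex-based email matching, and the sample table rows are all assumptions made for illustration, whereas Myanon itself is driven by a configuration file and a real SQL parser. The sketch only shows why a keyed HMAC-SHA256 maps the same input to the same output in every table, and how a dump can be rewritten line by line without holding it in memory.

```python
import hashlib
import hmac
import re
from typing import Iterable, Iterator

# Hypothetical secret; a real run would load it from configuration.
SECRET = b"example-secret"

# Simplified email pattern, used here only to locate values to replace.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize(value: str, secret: bytes = SECRET, length: int = 12) -> str:
    """Return a truncated hex HMAC-SHA256 of value, keyed with secret."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def anonymize_lines(lines: Iterable[str]) -> Iterator[str]:
    """Rewrite emails one line at a time, so no full copy of the dump is kept."""
    for line in lines:
        yield EMAIL_RE.sub(
            lambda m: pseudonymize(m.group(0)) + "@example.com", line
        )


# Illustrative dump fragment: the same email appears in two tables.
dump = [
    "INSERT INTO `users` VALUES (1,'alice@corp.test');\n",
    "INSERT INTO `orders` VALUES (10,'alice@corp.test');\n",
]
out = list(anonymize_lines(dump))
```

Because the hash is keyed and deterministic, both rows end up with the same replacement address, so a join between the two hypothetical tables on that column still matches after anonymization. In practice `anonymize_lines` could be fed `sys.stdin` and its output written to `sys.stdout`, which keeps memory use independent of dump size.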
Building from source requires autoconf, automake, flex, bison, and other dependencies, which can be cumbersome and error-prone compared with installing from a package manager, as noted in the installation instructions.
Only works with MySQL database dumps, so it's not adaptable to other database systems without significant modification, restricting its use in heterogeneous environments.
Uses a custom configuration syntax with back-quoted names and various rules, which can be confusing and error-prone for users managing complex schemas or multiple tables.