A declarative tool for generating realistic, scalable test data from code or existing databases.
Synth is a declarative data generator that creates realistic, scalable test data for software development and testing. It provides high-quality, privacy-safe data for populating new schemas, running integration tests, and simulating database growth, so teams do not have to copy production data. Users define data models as JSON configuration kept alongside their code, and Synth uses those models to generate millions of rows of semantically rich data.
Synth is aimed at developers and QA engineers who need realistic test data for applications, especially those who work with databases and need scalable, repeatable data generation for development, testing, or staging environments.
Synth takes a declarative, code-first approach to data generation, so teams can version control their data models and automate data creation. Its database-agnostic design and its ability to infer schemas from existing databases add flexibility and save time compared with manual or script-based data generation.
The Declarative Data Generator
Allows specifying entire data models as versionable code using JSON configuration, enabling peer review and automation, as shown in the 'users.json' example in the README.
Can import from Postgres, MySQL, and MongoDB to automatically deduce relations and types, saving manual effort in creating data models from existing databases.
Supports both SQL and NoSQL databases with semi-structured data, making it flexible for diverse tech stacks without vendor lock-in.
Leverages the fake-rs crate to generate rich, realistic data like names and addresses, ensuring test datasets are believable and useful for development.
Built in Rust for performance, it can scale to millions of rows of data, addressing the need to test system performance under load.
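To make the declarative idea in the points above concrete, here is a minimal Python sketch of a generator driven by a data model rather than by hand-written scripts. It is an illustration only: the schema keys (`sequence`, `choice`, `int_range`, `bool`) and the `generate_rows` function are hypothetical and are not Synth's JSON schema format or API. It uses only the standard library, with a seeded random generator standing in for a richer fake-data source.

```python
import random

# Hypothetical schema format for illustration only; this is NOT Synth's JSON schema.
USERS_SCHEMA = {
    "id": {"type": "sequence", "start": 1},
    "name": {"type": "choice", "values": ["Ada", "Grace", "Linus", "Margaret"]},
    "age": {"type": "int_range", "low": 18, "high": 90},
    "active": {"type": "bool", "probability": 0.8},
}


def generate_rows(schema, count, seed=0):
    """Build `count` dict rows by interpreting the declarative schema."""
    rng = random.Random(seed)  # a fixed seed makes runs repeatable
    rows = []
    for index in range(count):
        row = {}
        for field, spec in schema.items():
            kind = spec["type"]
            if kind == "sequence":
                # Monotonic identifier, independent of the random source
                row[field] = spec["start"] + index
            elif kind == "choice":
                row[field] = rng.choice(spec["values"])
            elif kind == "int_range":
                # randint is inclusive on both ends
                row[field] = rng.randint(spec["low"], spec["high"])
            elif kind == "bool":
                row[field] = rng.random() < spec["probability"]
            else:
                raise ValueError(f"unknown generator type: {kind}")
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in generate_rows(USERS_SCHEMA, 3, seed=42):
        print(row)
```

Because the model is plain data, it can be stored as a file, reviewed in a pull request, and re-run with the same seed to reproduce a dataset, which is the workflow the code-first approach is meant to enable.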
Currently in Public Alpha with 'a few kinks,' meaning it may have bugs, incomplete features, and breaking changes, as admitted in the status section of the README.
Defining complex data models in JSON can be cumbersome and error-prone, and the learning curve is steeper than with GUI-based or simpler scripting tools.
As a niche tool, it lacks extensive community plugins, integrations, or third-party support, which might hinder adoption in enterprise environments.