A modern Java/Kotlin/Groovy library for generating realistic fake data with extensive providers and format support.
Datafaker is a JVM library for generating realistic fake data during software development and testing. It provides a wide range of data providers (names, addresses, companies, etc.) and supports multiple output formats like CSV and JSON. The library helps developers create believable test datasets without manual data entry.
Datafaker is aimed at Java, Kotlin, and Groovy developers who need to populate databases, build mock APIs, or generate sample data for application demos and tests.
Datafaker is a modern, maintained fork of java-faker with updated dependencies, extensive locale support, and a rich expression system. Its schema-based format generation and unique value guarantees make it more capable than basic random data generators.
Generating fake data for the JVM (Java, Kotlin, Groovy) has never been easier!
With over 150 built-in providers spanning names, addresses, pop culture, and domain-specific fields like finance and healthcare, Datafaker removes the need to source or hand-code test data. The README's provider list runs from Address to Zelda.
Supports generating data directly into CSV, JSON, XML, and YAML using a declarative schema, making it easy to create structured mock APIs or database dumps. Examples in the README show how to define fields and transformers for formats.
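As a rough illustration of the declarative schema, the sketch below renders a few fake records as JSON. It assumes Datafaker 2.x is on the classpath and uses `Schema.of`, `field`, and `JsonTransformer` in the shape recalled from the project's documentation; exact generic signatures may differ between releases. The class name `SchemaExample` and the field names are made up for this example.

```java
import static net.datafaker.transformations.Field.field;

import net.datafaker.Faker;
import net.datafaker.transformations.JsonTransformer;
import net.datafaker.transformations.Schema;

public class SchemaExample {

    // Builds `rows` fake person records and renders them as one JSON string.
    // Assumption: JsonTransformer#generate(schema, limit) returns a String,
    // as in the 2.x documentation examples.
    static String peopleAsJson(int rows) {
        Faker faker = new Faker();
        // Each field pairs an output name with a supplier that produces its value.
        Schema<Object, ?> schema = Schema.of(
                field("firstName", () -> faker.name().firstName()),
                field("lastName", () -> faker.name().lastName()),
                field("city", () -> faker.address().city()));
        JsonTransformer<Object> transformer = JsonTransformer.<Object>builder().build();
        return transformer.generate(schema, rows);
    }

    public static void main(String[] args) {
        System.out.println(peopleAsJson(2));
    }
}
```

Because the schema is separate from the transformer, the same field definitions could in principle be handed to a different transformer (CSV, for instance) without touching the suppliers.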
Leverages standard Java Locales to produce region-specific data like US zip codes or Japanese names, which is crucial for testing internationalization. The README provides code snippets for locale-based initialization and lists supported locales.
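Locale-based initialization might look like the following minimal sketch, assuming Datafaker 2.x is available. `Faker(Locale)`, `address().zipCode()`, and `name().fullName()` are existing Datafaker calls; the class name `LocaleExample` and helper `zipFor` are hypothetical.

```java
import java.util.Locale;

import net.datafaker.Faker;

public class LocaleExample {

    // Returns a postal code drawn from the data set for the given locale.
    static String zipFor(Locale locale) {
        Faker faker = new Faker(locale);
        return faker.address().zipCode();
    }

    public static void main(String[] args) {
        // A Faker bound to a locale draws names, addresses, etc. from that locale's data.
        Faker japanese = new Faker(Locale.JAPAN);

        System.out.println(zipFor(Locale.US));
        System.out.println(japanese.name().fullName());
    }
}
```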
The unique value guarantee ensures no repeats within a session, essential for generating distinct test datasets without collisions. The documentation highlights methods like `faker.unique().fetchFromYaml()` for this purpose.
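A small sketch of the unique-value call mentioned above, assuming Datafaker 2.x. The lookup key `"address.country"` is an assumption about the bundled English data file, and `UniqueExample` and `distinctCountries` are illustrative names only.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

import net.datafaker.Faker;

public class UniqueExample {

    // Draws `count` values through a single Faker's unique() provider
    // and collects them in a set so duplicates would be visible.
    static Set<String> distinctCountries(int count) {
        Faker faker = new Faker(Locale.ENGLISH);
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < count; i++) {
            // Assumed key: "address.country" must resolve to a list in the locale data.
            seen.add(faker.unique().fetchFromYaml("address.country"));
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(distinctCountries(5));
    }
}
```

Since the underlying list is finite, requesting more unique values than it contains cannot succeed, so the requested count should stay well below the size of the list.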
Datafaker 2.x requires Java 17 or later, so teams on older LTS versions such as Java 8 or 11 must stay on the unmaintained 1.x branch, as noted in the README's upgrade warning.
Native image compatibility is labeled experimental and relies on manually curated metadata, which can lead to runtime issues in GraalVM or Quarkus deployments, as the README's native image section acknowledges.
Advanced features like the expression language or custom provider setup require diving into complex documentation and schema definitions, which can be overwhelming for simple data generation tasks.
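For simple cases the expression system can stay quite small. The sketch below, assuming Datafaker 2.x, uses two existing calls: `numerify`, which replaces each `#` with a random digit, and `expression`, which resolves `#{...}` directives against provider data. The class and method names are hypothetical.

```java
import net.datafaker.Faker;

public class ExpressionExample {

    // Replaces each '#' in the template with a random digit.
    static String invoiceNumber(Faker faker) {
        return faker.numerify("INV-####");
    }

    // Resolves each #{...} directive to a value from the named provider key.
    static String fullName(Faker faker) {
        return faker.expression("#{Name.first_name} #{Name.last_name}");
    }

    public static void main(String[] args) {
        Faker faker = new Faker();
        System.out.println(invoiceNumber(faker));
        System.out.println(fullName(faker));
    }
}
```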