A weekly updated dataset of Dungeons & Dragons characters submitted to character sheet web applications, with over 7,900 entries and standardized fields.
dnddata is an open-source dataset of Dungeons & Dragons characters collected from submissions to character sheet web applications. It provides a large, standardized collection of character attributes—such as race, class, abilities, and spells—for analysis and research. The dataset addresses the need for accessible, clean data on D&D character trends and demographics.
dnddata is aimed at data analysts, researchers, and D&D enthusiasts who want to explore character statistics, trends, and demographics within the D&D community. R users and data scientists working with gaming datasets will find it particularly useful.
Developers choose dnddata for its large sample, weekly updates, standardized fields that smooth over free-text inconsistencies, and availability in multiple formats (R, JSON, TSV). It is a community-sourced dataset that is not readily available elsewhere.
A dataset of D&D characters submitted to https://oganm.com/shiny/printSheetApp and https://oganm.com/shiny/interactiveSheet. A superset of the characters used in oganm/dndstats.
With over 7,900 characters and weekly automatic updates, it provides a substantial sample size for trend analysis, as noted in the README's examples and feature list.
Available as R data frames, JSON, and TSV files in the data-raw directory, facilitating easy integration with various programming languages and tools beyond R.
Includes processed fields like processedRace and processedSpells that clean up free-text inputs using heuristics, ensuring consistency for demographic studies.
Trends in the data are consistent with external analyses such as FiveThirtyEight's D&D article, as mentioned in the README, which supports reproducible research and validation of results.
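As a rough illustration of working with the TSV format outside R, the sketch below parses tab-separated text with Python's standard library and tallies one processed column. The sample rows are invented for illustration, and the `name` column is an assumption; only `processedRace` is named in the description above, so check the real file header in data-raw before relying on any column name.

```python
import csv
import io
from collections import Counter

# Invented sample rows standing in for a TSV file from data-raw.
# Only `processedRace` is named in the description; `name` is an assumed column.
SAMPLE_TSV = (
    "name\tprocessedRace\n"
    "a1\tHuman\n"
    "b2\tElf\n"
    "c3\tHuman\n"
    "d4\t\n"
)


def tally_column(tsv_text, column):
    """Count the non-empty values of `column` in tab-separated text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = Counter()
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:  # skip rows where the field is blank
            counts[value] += 1
    return counts


race_counts = tally_column(SAMPLE_TSV, "processedRace")
print(race_counts.most_common())
```

To try this on a downloaded file, the same function could be given the file's contents as a string instead of `SAMPLE_TSV`.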
Data is sourced from niche web apps advertised on Reddit communities, skewing towards a specific subset of D&D players and potentially overrepresenting test characters.
Processed fields like processedSpells rely on string matching with admitted error rates (e.g., 2 mistakes in 200, about 1%), so they should not be treated as exact where accuracy is critical.
Filtering for unique characters relies on heuristics based only on name and class, which may incorrectly exclude valid entries or include duplicates, as cautioned in the caveats.
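The trade-off in a name-and-class heuristic can be seen in a small sketch. This is not the dataset's actual filtering logic; it is a hypothetical illustration that keeps the first record for each normalized (name, class) pair, using invented records and assumed field names.

```python
def dedupe_by_name_and_class(records):
    """Keep the first record per case-insensitive (name, class) pair."""
    seen = set()
    unique = []
    for rec in records:
        key = (rec["name"].strip().lower(), rec["class"].strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Invented records; `name`, `class`, and `level` are assumed field names.
records = [
    {"name": "Tordek", "class": "Fighter", "level": 1},
    {"name": "tordek ", "class": "fighter", "level": 2},  # likely a resubmission
    {"name": "Bob", "class": "Wizard", "level": 5},
    {"name": "Bob", "class": "Wizard", "level": 9},  # may be another player's character
    {"name": "Bob", "class": "Rogue", "level": 5},  # kept: different class
]

unique = dedupe_by_name_and_class(records)
print(len(unique))
```

In this sketch the second "Bob the Wizard" is dropped even if it belongs to a different player, while a resubmission under a changed name would survive, which is the kind of error the caveat describes.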