
Gender by Name
Donated on 3/15/2020
This dataset attributes first names to genders, giving counts and probabilities. It combines open-source government data from the US, UK, Canada, and Australia.
Dataset Characteristics
Text
Subject Area
Social
Associated Tasks
Classification, Clustering
Attribute Type
-
# Instances
147270
# Attributes
4
Information
Additional Information
This dataset combines raw counts for first/given names of male and female babies over the time periods listed below, and then calculates a probability for each name from the aggregate count. Source datasets are from government authorities:
- US: Baby Names from Social Security Card Applications - National Data, 1880 to 2019
- UK: Baby names in England and Wales Statistical bulletins, 2011 to 2018
- Canada: British Columbia 100 Years of Popular Baby names, 1918 to 2018
- Australia: Popular Baby Names, Attorney-General's Department, 1944 to 2019
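The aggregation described above can be sketched roughly as follows. This is an illustration only, not the dataset author's actual pipeline. It assumes that "probability" means a (name, gender) pair's share of the grand total count across all sources, which the description does not state explicitly. The function name `name_probabilities` and the sample rows are hypothetical.

```python
from collections import Counter


def name_probabilities(rows):
    """Merge per-source counts and attach a probability to each (name, gender).

    `rows` is an iterable of (name, gender, count) tuples, possibly with the
    same (name, gender) pair appearing once per source country.
    ASSUMPTION: probability = pair count / grand total of all counts.
    """
    totals = Counter()
    for name, gender, count in rows:
        totals[(name, gender)] += count  # sum counts across sources

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []

    return [
        (name, gender, count, count / grand_total)
        for (name, gender), count in totals.items()
    ]


# Hypothetical counts, not taken from the real dataset.
sample_rows = [
    ("Alex", "M", 30),  # e.g. from one source
    ("Alex", "F", 10),
    ("Alex", "M", 20),  # same pair from another source
    ("Sam", "F", 40),
]
result = name_probabilities(sample_rows)
```

Under a different reading of the description (for example, the probability of a gender given a name), the denominator would be the per-name total rather than the grand total.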
Attribute Information
Additional Information
Name: String
Gender: M/F (category/string)
Count: Integer
Probability: Float
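As a minimal sketch of how these four attributes might be read into typed records, the example below parses CSV text with Python's standard library. It assumes the file is comma-separated with a header row whose column names match the attribute names above. The `NameRecord` class, the `read_records` function, and the sample rows are hypothetical and not part of the dataset.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRecord:
    name: str           # Name: String
    gender: str         # Gender: "M" or "F"
    count: int          # Count: Integer
    probability: float  # Probability: Float


def read_records(handle):
    """Parse rows from a text file object into NameRecord instances.

    ASSUMPTION: the header row is exactly Name,Gender,Count,Probability.
    """
    records = []
    for row in csv.DictReader(handle):
        gender = row["Gender"].strip().upper()
        if gender not in {"M", "F"}:
            raise ValueError(f"unexpected gender value: {row['Gender']!r}")
        records.append(
            NameRecord(
                name=row["Name"].strip(),
                gender=gender,
                count=int(row["Count"]),
                probability=float(row["Probability"]),
            )
        )
    return records


# Hypothetical sample in the assumed layout; values are made up.
SAMPLE_CSV = "Name,Gender,Count,Probability\nAlex,M,50,0.5\nAlex,F,10,0.1\n"
records = read_records(io.StringIO(SAMPLE_CSV))
```

For a real file, the same function could be called with `open(path, newline="", encoding="utf-8")` in place of the in-memory sample.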
Gender by Name. (2020). UCI Machine Learning Repository. https://doi.org/10.24432/C55G7X.
@misc{misc_gender_by_name_591,
  title = {{Gender by Name}},
  year = {2020},
  howpublished = {UCI Machine Learning Repository},
  note = {{DOI}: https://doi.org/10.24432/C55G7X}
}
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.