Gender by Name

Donated on 3/14/2020

This dataset attributes first names to genders, giving counts and probabilities. It combines open-source government data from the US, UK, Canada, and Australia.

Dataset Characteristics

Text

Subject Area

Social Science

Associated Tasks

Classification, Clustering

Feature Type

-

# Instances

147270

# Features

4

Dataset Information

Additional Information

This dataset combines raw counts of first (given) names for male and female babies over the periods listed below, then calculates a probability for each name from the aggregate count. The source datasets come from government authorities:
- US: Baby Names from Social Security Card Applications - National Data, 1880 to 2019
- UK: Baby names in England and Wales Statistical bulletins, 2011 to 2018
- Canada: British Columbia 100 Years of Popular Baby names, 1918 to 2018
- Australia: Popular Baby Names, Attorney-General's Department, 1944 to 2019
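As a rough illustration of the aggregation described above, the sketch below sums per-source counts for each (name, gender) pair and then derives the share of each gender within a name. The per-source rows, the function names, and the choice of a per-name gender share are all assumptions made for illustration; the page does not state the exact formula behind the Probability column, so this may not reproduce it.

```python
from collections import defaultdict

# Hypothetical per-source counts: (source, name, gender, count).
# These numbers are invented for illustration, not taken from the dataset.
source_rows = [
    ("US", "Avery", "F", 300),
    ("US", "Avery", "M", 100),
    ("UK", "Avery", "F", 60),
    ("UK", "Avery", "M", 40),
    ("US", "James", "M", 500),
]


def aggregate_counts(rows):
    """Sum counts over all sources for each (name, gender) pair."""
    totals = defaultdict(int)
    for _source, name, gender, count in rows:
        totals[(name, gender)] += count
    return dict(totals)


def gender_share(totals):
    """Return each gender's share of a name's aggregate count.

    Assumption: this is one plausible reading of "probability for a name
    given the aggregate count"; the dataset's own definition may differ.
    """
    per_name = defaultdict(int)
    for (name, _gender), count in totals.items():
        per_name[name] += count
    return {
        (name, gender): count / per_name[name]
        for (name, gender), count in totals.items()
    }


totals = aggregate_counts(source_rows)
shares = gender_share(totals)
```

With these made-up rows, "Avery" aggregates to 360 female and 140 male records, so the female share is 360 / 500.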

Has Missing Values?

No

Variables Table

Variable Name | Role | Type | Demographic | Description | Units | Missing Values
Name | Feature | Categorical | | | | no
Gender | Feature | Categorical | Gender | | | no
Count | Feature | Integer | | | | no
Probability | Feature | Continuous | | | | no


Additional Variable Information

Name: String
Gender: M/F (category/string)
Count: Integer
Probability: Float
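The variable types above suggest a simple way to parse rows into typed records. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses the column names Name, Gender, Count, and Probability, which the page does not confirm; the sample rows and the `parse_rows` helper are hypothetical.

```python
import csv
import io

# Hypothetical sample in the assumed layout; values are invented.
SAMPLE_CSV = """Name,Gender,Count,Probability
Alex,M,900,0.45
Alex,F,100,0.05
Maria,F,1000,0.50
"""


def parse_rows(text_stream):
    """Parse CSV text into dicts typed per the variable list above."""
    rows = []
    for record in csv.DictReader(text_stream):
        gender = record["Gender"]
        if gender not in ("M", "F"):
            raise ValueError(f"unexpected gender code: {gender!r}")
        rows.append({
            "name": record["Name"],                        # String
            "gender": gender,                              # "M" or "F"
            "count": int(record["Count"]),                 # Integer
            "probability": float(record["Probability"]),   # Float
        })
    return rows


rows = parse_rows(io.StringIO(SAMPLE_CSV))
```

To read the downloaded file instead of the sample, something like `open("name_gender_dataset.csv", newline="", encoding="utf-8")` could be passed to `parse_rows`; the UTF-8 encoding is an assumption.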

Dataset Files

File | Size
name_gender_dataset.csv | 3.6 MB


License
