Kinship
Donated on 6/30/1990
Relational dataset
Dataset Characteristics
Relational
Subject Area
Social Science
Associated Tasks
Relational-Learning
Feature Type
Categorical
# Instances
104
# Features
12
Dataset Information
Additional Information
This relational database consists of 24 unique names in two families with equivalent structures. Hinton used one unique output unit for each person and was interested in predicting the following relations: wife, husband, mother, father, daughter, son, sister, brother, aunt, uncle, niece, and nephew. Hinton used 104 input-output vector pairs (from a space of 12 x 24 = 288 possible pairs).

The prediction task is as follows: given a name and a relation, the outputs should be on for only those individuals (among the 24) who satisfy the relation, and off for all other individuals.

Hinton's results: Using 100 vectors for training and 4 for testing, his results on two passes yielded 7 correct responses out of 8 (87.5%). His network of 36 input units, 3 layers of hidden units, and 24 output units used 1500 sweeps of the training set during training.

Quinlan's results: Using FOIL, he repeated the experiment 20 times (rather than Hinton's 2 times). FOIL was correct 78 out of 80 times on the test cases (97.5%).
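The input and output layout described above can be illustrated with a minimal sketch. It assumes the 36 input units split into 24 one-hot person units followed by 12 one-hot relation units; the description gives only the totals, so this split and its ordering are assumptions. The `person_N` identifiers and the helper names `encode_input` and `encode_target` are hypothetical placeholders, not part of the dataset.

```python
# Relation names as listed in the dataset description.
RELATIONS = ["wife", "husband", "mother", "father", "daughter", "son",
             "sister", "brother", "aunt", "uncle", "niece", "nephew"]

# Hypothetical placeholder identifiers; the real names live in kinship.data.
PEOPLE = [f"person_{i}" for i in range(24)]


def encode_input(person, relation):
    """Return a 36-element vector: one-hot person (24) then one-hot relation (12).

    The person-then-relation ordering is an assumption made for illustration.
    """
    vec = [0] * (len(PEOPLE) + len(RELATIONS))
    vec[PEOPLE.index(person)] = 1
    vec[len(PEOPLE) + RELATIONS.index(relation)] = 1
    return vec


def encode_target(answers):
    """Return a 24-element vector that is on only for people satisfying the relation."""
    answers = set(answers)
    return [1 if p in answers else 0 for p in PEOPLE]


x = encode_input("person_0", "aunt")
y = encode_target(["person_3", "person_7"])  # made-up answer set
print(len(x), sum(x))                 # 36 2
print(len(y), sum(y))                 # 24 2
print(len(PEOPLE) * len(RELATIONS))   # 288 possible (person, relation) pairs
```

Because a relation such as aunt can hold for more than one person, the target is a multi-hot vector rather than a single class label.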
Has Missing Values?
No
Variable Information
The relation names are: wife, husband, mother, father, daughter, son, sister, brother, aunt, uncle, niece, nephew
Dataset Files
File | Size |
---|---|
kinship.data | 2.6 KB |
kinship.names | 1.9 KB |
Index | 114 Bytes |
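As a rough sketch of how the data file might be read, the snippet below parses facts written as `relation(Person1, Person2)` and groups them into one answer set per (person, relation) query. The line layout, the choice of the first argument as the query person, the sample names, and the helpers `parse_facts` and `group_pairs` are all assumptions made for illustration; check them against kinship.names before relying on them.

```python
import re
from collections import defaultdict

# Made-up sample lines in the assumed `relation(Person1, Person2)` layout.
SAMPLE = """\
father(Adam, Beth)
father(Adam, Carl)
mother(Dana, Beth)
"""

# One fact per line: a relation name followed by two comma-separated names.
FACT = re.compile(r"^\s*(\w+)\(\s*(\w+)\s*,\s*(\w+)\s*\)\s*$")


def parse_facts(text):
    """Return a list of (relation, first, second) tuples, skipping non-matching lines."""
    facts = []
    for line in text.splitlines():
        match = FACT.match(line)
        if match:
            facts.append(match.groups())
    return facts


def group_pairs(facts):
    """Group facts by (first argument, relation), collecting second arguments.

    Treating the first argument as the query person is an assumption.
    """
    pairs = defaultdict(set)
    for relation, first, second in facts:
        pairs[(first, relation)].add(second)
    return dict(pairs)


facts = parse_facts(SAMPLE)
pairs = group_pairs(facts)
print(len(facts), len(pairs))  # 3 2
```

Grouping this way turns several facts that share a person and a relation into a single input-output pair with multiple active outputs.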
    pip install ucimlrepo

    from ucimlrepo import fetch_ucirepo

    # fetch dataset
    kinship = fetch_ucirepo(id=55)

    # data (as pandas dataframes)
    X = kinship.data.features
    y = kinship.data.targets

    # metadata
    print(kinship.metadata)

    # variable information
    print(kinship.variables)
Hinton, G. (1986). Kinship [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5WS4D.
Creators
Geoff Hinton
DOI
10.24432/C5WS4D
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.