Wine

Donated on 6/30/1991

Using chemical analysis to determine the origin of wines

Dataset Characteristics

Tabular

Subject Area

Physics and Chemistry

Associated Tasks

Classification

Feature Type

Integer, Real

# Instances

178

# Features

13

Dataset Information

For what purpose was the dataset created?

test

Additional Information

These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines.

I think that the initial data set had around 30 variables, but for some reason I only have the 13-dimensional version. I had a list of what the 30 or so variables were, but (a) I lost it, and (b) I would not know which 13 variables are included in the set.

The attributes are (donated by Riccardo Leardi, riclea@anchem.unige.it):

1) Alcohol
2) Malic acid
3) Ash
4) Alcalinity of ash
5) Magnesium
6) Total phenols
7) Flavanoids
8) Nonflavanoid phenols
9) Proanthocyanins
10) Color intensity
11) Hue
12) OD280/OD315 of diluted wines
13) Proline

In a classification context, this is a well-posed problem with "well-behaved" class structures. A good data set for first testing of a new classifier, but not very challenging.
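As a rough illustration of the layout described above, the following sketch parses rows into a class label plus the 13 named attributes. It assumes the data file is comma-separated with no header row and with the class identifier in the first field; the function name `load_wine_rows`, the lowercase identifier spellings, and the sample values are this sketch's own and are not taken from the dataset.

```python
import csv
import io

# Attribute names follow the list above; the identifier spellings are
# an assumption of this sketch, not names defined by the dataset files.
FEATURE_NAMES = [
    "alcohol", "malic_acid", "ash", "alcalinity_of_ash", "magnesium",
    "total_phenols", "flavanoids", "nonflavanoid_phenols", "proanthocyanins",
    "color_intensity", "hue", "od280_od315_of_diluted_wines", "proline",
]


def load_wine_rows(lines):
    """Parse comma-separated rows into (class_label, {feature: value}) pairs.

    Assumes each row holds the class identifier first, then 13 attributes.
    """
    records = []
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        if len(row) != 1 + len(FEATURE_NAMES):
            raise ValueError(f"expected 14 fields, got {len(row)}")
        label = int(row[0])
        features = dict(zip(FEATURE_NAMES, map(float, row[1:])))
        records.append((label, features))
    return records


# Made-up rows for illustration only; these are NOT values from the dataset.
SAMPLE = io.StringIO(
    "1,13.0,2.0,2.4,16.0,100,2.5,2.5,0.3,1.8,5.0,1.0,3.0,1000\n"
    "3,12.5,3.0,2.3,20.0,95,1.6,0.8,0.5,1.1,7.0,0.7,1.7,600\n"
)
records = load_wine_rows(SAMPLE)
```

The same function should accept an open file handle, for example `open("wine.data", newline="")`, provided the downloaded file matches the assumed layout; a full file would then be expected to yield 178 records.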

Has Missing Values?

No

Introductory Paper

Comparative analysis of statistical pattern recognition methods in high dimensional settings

By S. Aeberhard, D. Coomans, O. de Vel. 1994

Published in Pattern Recognition

Variables Table

Variable Name	Role	Type	Missing Values
class	Target	Categorical	no
Alcohol	Feature	Continuous	no
Malicacid	Feature	Continuous	no
Ash	Feature	Continuous	no
Alcalinity_of_ash	Feature	Continuous	no
Magnesium	Feature	Integer	no
Total_phenols	Feature	Continuous	no
Flavanoids	Feature	Continuous	no
Nonflavanoid_phenols	Feature	Continuous	no
Proanthocyanins	Feature	Continuous	no

Showing 10 of 14 variables.

Additional Variable Information

All attributes are continuous. No statistics are available, but it is suggested to standardise variables for certain uses (e.g. for use with classifiers which are NOT scale invariant).

NOTE: The 1st attribute is the class identifier (1-3).
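One common reading of "standardise" is z-scoring: subtract each attribute's mean and divide by its standard deviation, so that attributes measured on very different scales contribute comparably to distance-based classifiers. The sketch below shows that idea with the standard library only; the function name `standardise` and the sample values are illustrative assumptions, not part of the dataset or its documentation.

```python
from statistics import fmean, pstdev


def standardise(columns):
    """Rescale each column to zero mean and unit (population) standard deviation."""
    scaled = []
    for values in columns:
        mean = fmean(values)
        spread = pstdev(values)
        if spread == 0:
            # A constant column has no spread to rescale; map it to zeros.
            scaled.append([0.0 for _ in values])
        else:
            scaled.append([(v - mean) / spread for v in values])
    return scaled


# Made-up values on very different scales (illustration only, not dataset values).
columns = [
    [12.0, 13.0, 14.0],        # small-magnitude attribute
    [500.0, 1000.0, 1500.0],   # large-magnitude attribute
]
scaled = standardise(columns)
```

After scaling, both columns lie on the same scale even though the raw magnitudes differ by orders of magnitude. When evaluating a classifier, the mean and standard deviation would normally be computed on the training split only and then reused for the test split.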

Dataset Files

File	Size
wine.data	10.5 KB
wine.names	3 KB
Index	105 Bytes

Papers Citing this Dataset

An information criterion for auxiliary variable selection in incomplete data analysis

By Shinpei Imori, Hidetoshi Shimodaira. 2019

Published in Entropy.

Clustering through the optimal transport barycenter problem

By Hongkang Yang, Esteban Tabak. 2019

Published in

Causal Regularization

By Dominik Janzing. 2019

Published in ArXiv.

Supporting Analysis of Dimensionality Reduction Results with Contrastive Learning

By Takanori Fujiwara, Oh-Hyun Kwon, Kwan-Liu Ma. 2019

Published in ArXiv.

In Defense of Synthetic Data

By Luke Rodriguez, Bill Howe. 2019

Published in ArXiv.

131 citations

379849 views

Keywords

Chemistry

Creators

Stefan Aeberhard

M. Forina

DOI

10.24432/C5PC7J

License

This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.

This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.
