Wine

Donated on 6/30/1991

Using chemical analysis to determine the origin of wines

Dataset Characteristics

Tabular

Subject Area

Physical Science

Associated Tasks

Classification

Feature Type

Integer, Real

# Instances

178

# Features

13

Dataset Information

For what purpose was the dataset created?

test

Additional Information

These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines.

I think that the initial data set had around 30 variables, but for some reason I only have the 13-dimensional version. I had a list of what the 30 or so variables were, but a) I lost it, and b) I would not know which 13 variables are included in the set.

The attributes are (donated by Riccardo Leardi, riclea@anchem.unige.it):

1) Alcohol
2) Malic acid
3) Ash
4) Alcalinity of ash
5) Magnesium
6) Total phenols
7) Flavanoids
8) Nonflavanoid phenols
9) Proanthocyanins
10) Color intensity
11) Hue
12) OD280/OD315 of diluted wines
13) Proline

In a classification context, this is a well-posed problem with "well-behaved" class structures. It is a good data set for first testing of a new classifier, but not very challenging.
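The attribute list above can be sketched as a small record parser. This is a minimal illustration, not the repository's own loader: it assumes each record is a comma-separated line with the class identifier first (as noted under Additional Variable Information), followed by the 13 attributes in the listed order. The function name `parse_record` and the example record are hypothetical; the values are placeholders, not a row from the dataset.

```python
# Attribute names in the order listed above; spellings follow the source.
ATTRIBUTES = [
    "Alcohol", "Malic acid", "Ash", "Alcalinity of ash", "Magnesium",
    "Total phenols", "Flavanoids", "Nonflavanoid phenols", "Proanthocyanins",
    "Color intensity", "Hue", "OD280/OD315 of diluted wines", "Proline",
]


def parse_record(line):
    """Split one record into (class_id, {attribute name: value}).

    Assumption: fields are comma-separated and the class identifier
    (1-3) comes first, followed by the 13 attributes in order.
    """
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != len(ATTRIBUTES) + 1:
        raise ValueError(
            f"expected {len(ATTRIBUTES) + 1} fields, got {len(fields)}"
        )
    class_id = int(fields[0])
    values = [float(f) for f in fields[1:]]
    return class_id, dict(zip(ATTRIBUTES, values))


# Hypothetical record for illustration only; not a row from the dataset.
example = "2, 12.5, 1.9, 2.3, 20.0, 95, 2.1, 1.8, 0.4, 1.5, 3.0, 1.0, 2.8, 500"
class_id, sample = parse_record(example)
print(class_id, sample["Alcohol"], sample["Proline"])
```

Keeping the class identifier separate from the feature dictionary makes it harder to feed the label to a classifier as if it were a 14th feature.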

Has Missing Values?

No

Introductory Paper

Comparative analysis of statistical pattern recognition methods in high dimensional settings

By S. Aeberhard, D. Coomans, O. de Vel. 1994

Published in Pattern Recognition

Variables Table

Variable Name         Role     Type         Demographic  Description  Units  Missing Values
class                 Target   Categorical                                   no
Alcohol               Feature  Continuous                                    no
Malicacid             Feature  Continuous                                    no
Ash                   Feature  Continuous                                    no
Alcalinity_of_ash     Feature  Continuous                                    no
Magnesium             Feature  Integer                                       no
Total_phenols         Feature  Continuous                                    no
Flavanoids            Feature  Continuous                                    no
Nonflavanoid_phenols  Feature  Continuous                                    no
Proanthocyanins       Feature  Continuous                                    no

Showing the first 10 of 14 variables.

Additional Variable Information

All attributes are continuous. No statistics are available, but standardising the variables is suggested for certain uses (e.g. for use with classifiers which are NOT scale invariant).

NOTE: The 1st attribute is the class identifier (1-3).
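The standardisation suggested above can be sketched as follows. This is one possible reading, assuming "standardise" means rescaling each feature column to zero mean and unit variance (z-scores); the choice of the population standard deviation is an assumption here, not something the source specifies. The helper name `standardise` and the records are hypothetical placeholders, not values from the dataset.

```python
from statistics import mean, pstdev


def standardise(rows):
    """Rescale each column to zero mean and unit (population) variance."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # A constant column has zero spread; divide by 1.0 so it maps to 0.0.
    stds = [pstdev(c) or 1.0 for c in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical records: class identifier first, then two features
# on very different scales. These are not rows from the dataset.
records = [
    [1, 13.0, 1000.0],
    [2, 12.0, 500.0],
    [3, 14.0, 750.0],
]

# The class identifier is split off and never standardised.
labels = [int(r[0]) for r in records]
features = [r[1:] for r in records]
scaled = standardise(features)

for label, row in zip(labels, scaled):
    print(label, [round(v, 3) for v in row])
```

After rescaling, both columns have comparable spread, so a distance-based classifier is no longer dominated by the feature with the largest raw values.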

Papers Citing this Dataset

An information criterion for auxiliary variable selection in incomplete data analysis

By Shinpei Imori, Hidetoshi Shimodaira. 2019

Published in Entropy.

Clustering through the optimal transport barycenter problem

By Hongkang Yang, Esteban Tabak. 2019

Published in

Causal Regularization

By Dominik Janzing. 2019

Published in ArXiv.

Supporting Analysis of Dimensionality Reduction Results with Contrastive Learning

By Takanori Fujiwara, Oh-Hyun Kwon, Kwan-Liu Ma. 2019

Published in ArXiv.

In Defense of Synthetic Data

By Luke Rodriguez, Bill Howe. 2019

Published in ArXiv.

131 citations
159762 views

Keywords

Chemistry

Creators

Stefan Aeberhard

M. Forina

License
