
Wine Data Set

Below are papers that cite this data set, with context shown. Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info.

Return to Wine data set page.


Ping Zhong and Masao Fukushima. A Regularized Nonsmooth Newton Method for Multi-class Support Vector Machines. 2005.

the starting point of the next (k + 1)th iteration. The parameters ν1 and ν2 in (3) are both set to 0.01. In Algorithm 3.1, we replaced the standard Armijo rule in (S.3) by
Table 1: Six benchmark datasets from UCI
name     iris  wine  glass  vowel  vehicle  segment
#pts     150   178   214    528    846      2310
#feats   4     13    9      10     18       19
#cls     3     3     6      11     4        7
#pts: the number of training data; #feats: the number of


Igor Fischer and Jan Poland. Amplifying the Block Matrix Structure for Spectral Clustering. Telecommunications Lab. 2005.

are common benchmark sets with real-world data (Murphy & Aha, 1994): the iris, the wine and the breast cancer data set. Both our methods perform very well on iris and breast cancer. However, the wine data set is too sparse for the context-dependent method: only 178 points in 13 dimensions, giving the conductivity too


Agapito Ledezma and Ricardo Aler and Araceli Sanchís and Daniel Borrajo. Empirical Evaluation of Optimized Stacking Configurations. ICTAI. 2004.

Table 1. Datasets descriptions (excerpt): dataset, attributes, instances, train, test, classes:
hepatitis   19   155    77    78    2
hypo        25   3163   2846  317   2
image       19   2310   1848  462   7
ionosphere  34   351    175   176   2
iris        4    150    75    75    3
soya        35   683    341   342   19
vote        16   435    217   218   2
wine        13   178    89    89    3
C4.5 [25] generates decision trees. A probabilistic Naive Bayes classifier [19]. IBk, Aha's instance-based learning algorithm [1]. PART [14] forms a decision list


Jianbin Tan and David L. Dowe. MML Inference of Oblique Decision Trees. Australian Conference on Artificial Intelligence. 2004.

and medical data, such as Bupa, Breast Cancer, Wisconsin, Lung Cancer, and Cleveland. The nine UCI Repository [1] data-sets used are these five plus Balance, Credit, Sonar and Wine. For each of the nine data sets, 100 independent tests were done by randomly sampling 90% of the data as training data and testing on the remaining 10%. 4 Discussion We compare the MML oblique tree scheme to C4.5 and C5. The


Sugato Basu. Semi-Supervised Clustering with Limited Background Knowledge. AAAI. 2004.

Experiments were conducted on several datasets from the UCI repository: Iris, Wine and representative randomly sampled subsets from the Pen-Digits and Letter datasets. For Pen-Digits and Letter, we chose two sets of three classes: {I, J, L}


Stefan Mutter and Mark Hall and Eibe Frank. Using Classification to Evaluate the Output of Confidence-Based Association Rule Mining. Australian Conference on Artificial Intelligence. 2004.

Table 1 (excerpt). The UCI datasets used for the experiments and their properties:
iris          150    4   0   0    3   0.0
labor          57    8   3   5    2   35.7
led7         1000    0   7   0   10   0.0
lenses         24    0   0   4    3   0.0
pima          768    8   0   0    2   0.0
tic-tac-toe   958    0   0   9    2   0.0
wine          178   13   0   0    3   0.0
In the led7 dataset 10% of the instances are noisy. In every experiment the support threshold s_min of Apriori was set to 1% of all instances and the
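
A minimal sketch, not the authors' code, of the support threshold quoted above (1% of all instances); it assumes the mlxtend library and a one-hot encoded transaction DataFrame:

```python
# Hypothetical illustration of a 1% Apriori support threshold,
# using mlxtend on a toy one-hot transaction table.
import pandas as pd
from mlxtend.frequent_patterns import apriori

transactions = pd.DataFrame(
    {"a": [1, 1, 0, 1], "b": [0, 1, 1, 1], "c": [1, 0, 0, 1]}
).astype(bool)

# min_support is a fraction of all instances, so 1% = 0.01
frequent_itemsets = apriori(transactions, min_support=0.01, use_colnames=True)
print(frequent_itemsets)
```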


Jennifer G. Dy and Carla Brodley. Feature Selection for Unsupervised Learning. Journal of Machine Learning Research, 5. 2004.

EM-k-STD (e). Figure 9: Feature selection versus without feature selection on the four-class data. 6.5 Experiments on Real Data We examine the FSSEM variants on the iris, wine and ionosphere data sets from the UCI learning repository (Blake and Merz, 1998), and on a high resolution computed tomography (HRCT) lung image data set which we collected from the IUPUI medical center (Dy et


Yuan Jiang and Zhi-Hua Zhou. Editing Training Data for kNN Classifiers with Neural Network Ensemble. ISNN (1). 2004.

Table (excerpt): dataset, nominal attributes, continuous attributes, size, classes:
annealing   33   5   798   6
credit       9   6   690   2
glass        0   9   214   7
hayes-roth   4   0   132   3
iris         0   4   150   3
liver        0   6   345   2
pima         0   8   768   2
soybean     35   0   683  19
wine         0  13   178   3
zoo         16   0   101   7
On each data set, 10 runs of 10-fold cross validation are performed with random partitions. The effects of the editing approaches described in Section 2 are compared through coupling them with a 3NN classifier. The
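
As a rough illustration of that protocol, a minimal sketch assuming scikit-learn and its bundled copy of the Wine data (stratified folds are an additional assumption) of 10 runs of 10-fold cross-validation with a 3NN classifier:

```python
# Sketch of 10 x 10-fold cross-validation with a 3NN classifier on Wine.
from sklearn.datasets import load_wine
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_wine(return_X_y=True)
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)
scores = cross_val_score(KNeighborsClassifier(n_neighbors=3), X, y, cv=cv)
print(scores.mean(), scores.std())
```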


Mikhail Bilenko and Sugato Basu and Raymond J. Mooney. Integrating constraints and metric learning in semi-supervised clustering. ICML. 2004.

Experiments were conducted on three datasets from the UCI repository: Iris, Wine and Ionosphere (Blake & Merz, 1998); the Protein dataset used by Xing et al. (2003) and Bar-Hillel et al. (2003), and randomly sampled subsets from the Digits


Sugato Basu. Also Appears as Technical Report, UT-AI. PhD Proposal. 2003.

Experiments were conducted on several datasets from the UCI repository: Iris, Wine and representative randomly sampled subsets from the Pen-Digits and Letter datasets. For Pen-Digits and Letter, we chose two sets of three classes: {I, J, L}


Michael L. Raymer and Travis E. Doom and Leslie A. Kuhn and William F. Punch. Knowledge discovery in medical and biological datasets using a hybrid Bayes classifier/evolutionary algorithm. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 33. 2003.

for testing the ability of a classifier and feature extractor to maintain or increase classification accuracy while reducing dimensionality when there are fewer features to work with. Wine -- This data set consists of the results of a chemical analysis of wines derived from three different cultivars [37, 38]. There are 13 continuous features, with no missing values. There are 59, 71, and 48 members of
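
For orientation, a minimal sketch (not from the cited paper) of loading the Wine data via scikit-learn's bundled copy of the UCI set and checking the 59/71/48 class composition described above:

```python
# Load the Wine recognition data and inspect its shape and class sizes.
from collections import Counter
from sklearn.datasets import load_wine

wine = load_wine()
X, y = wine.data, wine.target   # 178 samples, 13 continuous features
print(X.shape)                  # (178, 13)
print(Counter(y))               # class sizes: 59, 71, and 48
```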


Jeremy Kubica and Andrew Moore. Probabilistic Noise Identification and Data Cleaning. ICDM. 2003.

We also compared the algorithms by their ability to identify artificial corruptions. Three different test sets were used: a noise free version of the rock data described above, the UCI Iris data set, and the UCI Wine data set [3]. Noise was generated by choosing to corrupt each record with some probability p. For each record chosen, corruption and noise vectors were sampled from their
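
A rough sketch of this kind of corruption scheme, not the authors' implementation; the Gaussian noise model and its scale are assumptions for illustration, since the paper samples corruption and noise vectors from its own model:

```python
import numpy as np

def corrupt(X, p=0.1, noise_scale=1.0, seed=None):
    """Corrupt each record independently with probability p by adding noise."""
    rng = np.random.default_rng(seed)
    X_noisy = np.array(X, dtype=float)                # work on a float copy
    mask = rng.random(len(X_noisy)) < p               # records chosen for corruption
    noise = rng.normal(0.0, noise_scale, size=(mask.sum(), X_noisy.shape[1]))
    X_noisy[mask] += noise
    return X_noisy, mask
```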


Mukund Deshpande and George Karypis. Using conjunction of attribute values for classification. CIKM. 2002.

Table 1: UCI dataset statistics (excerpt): dataset, continuous attributes, nominal attributes, classes, instances:
heart     13   0   2   270
hepati     6  13   2   155
horse      7  15   2   368
iris       4   0   3   150
labor      8   8   2    57
led7       0   7  10  3200
lymph      0  18   4   148
pima       8   0   2   768
tic-tac    0   9   2   958
wine      13   0   3   178
zoo        0  16   7   101
We performed our experiments using a 10 way cross validation scheme and computed average accuracy across different runs. We ran our experiments using a support threshold of 1.0% for all


Petri Kontkanen and Jussi Lahtinen and Petri Myllymaki and Tomi Silander and Henry Tirri. Proceedings of Pre- and Post-processing in Machine Learning and Data Mining: Theoretical Aspects and Applications, a workshop within Machine Learning and Applications. Complex Systems Computation Group (CoSCo). 1999.

Table 1: The datasets used in the experiments (excerpt): dataset, instances, attributes, classes:
Postoperative Patient          90    9   3
Thyroid Disease               215    6   3
Tic-Tac-Toe Endgame           958   10   2
Vehicle Silhouettes           846   19   4
Congressional Voting Records  435   17   2
Wine Recognition              178   14   3
For estimating the quality of the visualizations produced, we used the validation scheme described in the previous section. The prediction methods used are listed in Table


Ethem Alpaydin. Voting over Multiple Condensed Nearest Neighbors. Artif. Intell. Rev, 11. 1997.

accuracy goes higher but the variance also decreases. This indicates better generalization and is the clear advantage of voting. Complete results are given in Table 4. Results for the IRIS and WINE datasets are similar and are omitted. When one increases the number of voting subsets, after a certain number, new subsets do not contribute much. Whether an additional subset pays off the additional


Georg Thimm and E. Fiesler. Optimal Setting of Weights, Learning Rate, and Gain. IDIAP Research Report. 1997.

Multilayer perceptrons behave similarly, as shown in figure 4, as confirmed by experiments performed with the Solar, Wine, Glass and Servo data sets. The most important difference with high order perceptrons is that the networks do not converge, or converge only very slowly, for weight variances close to zero. Such variances should therefore not be used


Pedro Domingos. Unifying Instance-Based and Rule-Based Induction. Machine Learning, 24. 1996.

included in the listing of empirical results in (Holte, 1993) are referred to by the same codes. In the first phase of the study, the first 15 datasets in Table 4 (from breast cancer to wine) were used to fine-tune the algorithms, choosing by 10-fold cross-validation the most accurate version of each. Since a complete factor analysis would be too


Kamal Ali and Michael J. Pazzani. Error Reduction through Learning Multiple Descriptions. Machine Learning, 24. 1996.

the efficacy of using multiple models. It is important to analyze these experimental data because the amount of error reduction obtained by using multiple models varies a great deal. On the wine data set, for example, the error obtained by uniformly weighted voting between eleven stochastically-generated descriptions is only one seventh of the error obtained by using a single description. On


Georg Thimm and Emile Fiesler. IDIAP Technical report High Order and Multilayer Perceptron Initialization. IEEE Transactions. 1994.

itself has a large influence on the optimal initial weight variance: for the solar, wine and servo data sets, the networks have about the same size for the same order, but the optimal value for the weight variance differs a lot for the network with the logistic activation function.


Włodzisław Duch. Coloring black boxes: visualization of neural network decisions. School of Computer Engineering, Nanyang Technological University.

a linear projection method is introduced, projecting the network outputs into K vertices of a polygon. Section three presents a detailed case study using MLP and RBF networks for the 3-class Wine dataset, and some examples for the 5-class Satimage dataset. In the last section, discussion and some remarks on the usefulness and further development of such visualization methods are given. Since the use of


H. Altay Guvenir. A Classification Learning Algorithm Robust to Irrelevant Features. Bilkent University, Department of Computer Engineering and Information Science.

[Plot: classification accuracy (0.0-1.0) versus number of irrelevant features added (0-20) for VFI5, 1NN, 3NN and 5NN on the Wine data set.] Fig. 7. The comparison of the average classification accuracies for kNN and VFI5 on some of the UCI-Repository data sets with an increasing number of artificially added irrelevant


Christian Borgelt and Rudolf Kruse. Speeding Up Fuzzy Clustering with Neural Network Techniques. Research Group Neural Networks and Fuzzy Systems Dept. of Knowledge Processing and Language Engineering, School of Computer Science Otto-von-Guericke-University of Magdeburg.

the number of clusters to find (abalone: 3, breast: 2, iris: 3, wine: 3), so we ran the clustering using these numbers. In addition, we ran the algorithm with 6 clusters for the abalone and the wine dataset. The clustering process was terminated when a normal update step changed no center coordinate by more than 10^-6. That is, regardless of the modification employed, we used the normal update step to
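
A minimal sketch of that stopping rule, not the authors' code: iterate until no cluster-center coordinate moves by more than 10^-6. The update function is left as a placeholder; the paper uses fuzzy c-means updates:

```python
import numpy as np

def run_until_converged(X, centers, update_step, tol=1e-6, max_iter=1000):
    """Repeat update_step until no center coordinate changes by more than tol."""
    for _ in range(max_iter):
        new_centers = update_step(X, centers)
        if np.max(np.abs(new_centers - centers)) <= tol:
            return new_centers
        centers = new_centers
    return centers
```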


Denver Dash and Gregory F. Cooper. Model Averaging with Discrete Bayesian Network Classifiers. Decision Systems Laboratory Intelligent Systems Program University of Pittsburgh.

variable is considered to be the "positive" state; therefore, the scores in Table 5 are average scores for all ROC curves associated with a particular classification variable; therefore some data sets (e.g., wine) have no zero entries when two or more classifiers score highest on different curves. We have underlined the top two scoring classifiers for each data set to emphasize the fact that AMA


Ping Zhong and Masao Fukushima. Second Order Cone Programming Formulations for Robust Multi-class Classification.

problem as follows:
max_{α,σ,τ}  e^T α − (σ + τ)
s.t.  Ē^T α = 0,  α ≤ (1 − ν)e,  σ − τ = ν,   (38)
      ‖ [ −(1/√(2(K+1))) Ã^T α ;  τ ] ‖ ≤ σ.
Table 1: Description of the Iris, Wine and Glass datasets.
name   dimension (N)  #classes (K)  #examples (L)
Iris   4              3             150
Wine   13             3             178
Glass  9              6             214
Table 2: Results for the Iris, Wine and Glass datasets with noise (ρ = 0.3, κ = 2, ν = 0.05).


Aynur Akkus and H. Altay Guvenir. Weighting Features in k Nearest Neighbor Classification on Feature Projections. Department of Computer Engineering and Information Science Bilkent University.

except k = 1. There were no significant differences between the two weight learning algorithms on the wine dataset. 5 Conclusions A version of the famous k-NN algorithm that stores the classification knowledge as the projections of the training instances on the features, called the k-NNFP algorithm, had been


C. Titus Brown and Harry W. Bullen and Sean P. Kelly and Robert K. Xiao and Steven G. Satterfield and John G. Hagedorn and Judith E. Devaney. Visualization and Data Mining in an 3D Immersive Environment: Summer Project 2003.

any patterns in the RGB values of the roads and neighboring terrain due to significant rounding of the RGB values. Figure 4.7: The two-color graph for all clear road images. 4.8 Wine The Wine data set was analysed by Harry Bullen. This data set contains the chemical analysis of wines from Italy. Wines from three different types of grapes are included. There are levels of 13 chemicals provided for


Stefan Aeberhard and Danny Coomans and De Vel. THE PERFORMANCE OF STATISTICAL PATTERN RECOGNITION METHODS IN HIGH DIMENSIONAL SETTINGS. James Cook University.

means coincide. FDP performed very well for the exponential data. The results of the real data support the observations made from the simulations. FDP does not perform very well on well-defined data sets (wine data, Iris data), especially when compared to FF. It however compares somewhat better in the other cases, most noticeably in the case of the tertiary institutions data, where it equals the


Pramod Viswanath and M. Narasimha Murty and Shalabh Bhatnagar. Partition Based Pattern Synthesis Technique with Efficient Algorithms for Nearest Neighbor Classification. Department of Computer Science and Automation, Indian Institute of Science.

We performed experiments with five different datasets, viz., OCR, WINE, VOWEL, THYROID, GLASS and PENDIGITS, respectively. Except for the OCR dataset, all others are from the UCI Repository [19]. The OCR dataset is also used in [20,18]. The properties of the


Yin Zhang and W. Nick Street. Bagging with Adaptive Costs. Management Sciences Department University of Iowa Iowa City.

[2]: Autompg, Bupa, Glass, Haberman, Housing, Cleveland-heart-disease, Hepatitis, Ion, Pima, Sonar, Vehicle, WDBC, Wine and WPBC. Some of the data sets do not originally depict two-class problems so we did some transformation on the dependent variables to get binary class labels. Specifically in our experiments, Autompg data is labeled by whether


Daichi Mochihashi and Gen-ichiro Kikui and Kenji Kita. Learning Nonstructural Distance Metric by Minimum Cluster Distortions. ATR Spoken Language Translation research laboratories.

to get a small increase in precision like the document retrieval experiment in section 5.2. [Plots of precision versus dimension for (a) the wine dataset, (b) the "protein" dataset, and (c) the "iris" dataset.]


Abdelhamid Bouchachia. RBF Networks for Learning from Partially Labeled Data. Department of Informatics, University of Klagenfurt.

data points. Once selected, these data points and the given labeled data points are used to train the neural network. 13 4. Numerical Evaluation To evaluate the approach presented here, two data sets are used: the cancer and the wine data set (Hettich et al., 1998). The cancer data consists of 683 instances with 9 features, while the wine data set consists of 178 instances with 13 features.


K. A. J Doherty and Rolf Adams and Neil Davey. Unsupervised Learning with Normalised Data and Non-Euclidean Norms. University of Hertfordshire.

considered were the Ionosphere, Image Segmentation (training data), Wisconsin Diagnostic Breast Cancer (WDBC) and Wine data sets. These data sets were selected to show our approach on data with a range of classes, dimensionality and data distributions. The basic characteristics of each data set are shown in Table 2.


Erin J. Bredensteiner and Kristin P. Bennett. Multicategory Classification by Support Vector Machines. Department of Mathematics University of Evansville.

methods. The kernel function for the piecewise-nonlinear M-SVM and k-SVM methods is K(x, x_i) = (x·x_i/n + 1)^d, where d is the degree of the desired polynomial. Wine Recognition Data The Wine dataset [1] uses the chemical analysis of wine to determine the cultivar. There are 178 points with 13 features. This is a three class dataset distributed as follows: 59 points in class 1, 71 points in
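
A minimal sketch of that kernel; the normalization by n (the number of features) and the parenthesization are inferred from the garbled formula above, so treat them as an assumption rather than the authors' exact definition:

```python
import numpy as np

def poly_kernel(x, xi, d=2):
    """K(x, x_i) = (x . x_i / n + 1)^d, with n the number of features."""
    n = np.shape(x)[-1]          # 13 for the Wine data
    return (np.dot(x, xi) / n + 1.0) ** d
```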


Stefan Aeberhard and O. de Vel and Danny Coomans. New Fast Algorithms for Variable Selection based on Classifier Performance. James Cook University.

taken to add or eliminate half of the variables. complexities from O(d^4) to O(d^3). The time complexities of the optimal algorithms for QDA and LDA based variable selection are O(N_p d^2). Two data sets were used. The first, the wine data [11], is 13 dimensional with three classes and 59, 71 and 48 objects per class. The classes correspond to three different cultivars and the 13 variables measure


Georg Thimm and Emile Fiesler. High Order and Multilayer Perceptron Initialization.

itself has a large influence on the optimal initial weight variance: for the solar, wine and servo data sets, the networks have about the same size for the same order, but the optimal value for the weight variance differs a lot for the network with the logistic activation function. Further, the optimal


Pramod Viswanath and M. Narasimha Murty and Shalabh Bhatnagar. A pattern synthesis technique to reduce the curse of dimensionality effect. E-mail.

We performed experiments with five different datasets, viz., OCR, WINE, THYROID, GLASS and PENDIGITS, respectively. Except for the OCR dataset, all others are from the UCI Repository [16]. The OCR dataset is also used in [17, 18]. The properties of the


Chih-Wei Hsu and Cheng-Ru Lin. A Comparison of Methods for Multi-class Support Vector Machines. Department of Computer Science and Information Engineering National Taiwan University.

section we present experimental results on several problems from the Statlog collection [20] and the UCI Repository of machine learning databases [1]. From the UCI Repository we choose the following datasets: iris, wine, glass, and vowel. Those problems had already been tested in [27]. From the Statlog collection we choose all multi-class datasets: vehicle, segment, dna, satimage, letter, and shuttle. Note


Petri Kontkanen and Jussi Lahtinen and Petri Myllymaki and Tomi Silander and Henry Tirri. USING BAYESIAN NETWORKS FOR VISUALIZING HIGH-DIMENSIONAL DATA. Complex Systems Computation Group (CoSCo).

Table 1: The datasets used in the experiments (excerpt): dataset, instances, attributes, classes:
Postoperative Patient          90    9   3
Thyroid Disease               215    6   3
Tic-Tac-Toe Endgame           958   10   2
Vehicle Silhouettes           846   19   4
Congressional Voting Records  435   17   2
Wine Recognition              178   14   3
For estimating the quality of the visualizations produced, we used the validation scheme described in the previous section. The prediction methods used are listed in Table


Perry Moerland and E. Fiesler and I. Ubarretxena-Belandia. Incorporating LCLV Non-Linearities in Optical Multilayer Neural Networks. Preprint of an article published in Applied Optics.

have been used, namely the sonar benchmark [13] and the wine data set [14]. Sonar: This data set was originally used by R. Gorman and T. Sejnowski in their study of the classification of sonar signals using a neural network. The task is to discriminate between sonar


Matthias Scherf and W. Brauer. Feature Selection by Means of a Feature Weighting Approach. GSF - National Research Center for Environment and Health.

Table 1. Summary of data sets used (excerpt): dataset, instances, continuous features, nominal features:
Monks1                    124      0   6
Monks2                    169      0   6
Monks3                    122      0   6
Parity5+5                 100      0  10
Vowel                     528+462  10  0
Wisconsin Breast Cancer   569      30  0
Pima-Diabetes             768      8   0
Liver Disorders           745      6   0
Wine                      178      13  0
The reason for the choice of this special RBF network is its ability to automatically determine the number of base functions, which is a measure of classifier complexity. 6.1 Critical Parameters


Return to Wine data set page.
