Center for Machine Learning and Intelligent Systems



Iris Data Set

Below are papers that cite this data set, with context shown. Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info.

Return to Iris data set page.


Ping Zhong and Masao Fukushima. A Regularized Nonsmooth Newton Method for Multi-class Support Vector Machines. 2005.

the starting point of the next (k + 1)th iteration. The parameters ν1 and ν2 in (3) are both set to 0.01. In Algorithm 3.1, we replaced the standard Armijo rule in (S.3) by Table 1: Six benchmark datasets from UCI. name: iris, wine, glass, vowel, vehicle, segment; #pts: 150, 178, 214, 528, 846, 2310; #ats: 4, 13, 9, 10, 18, 19; #cls: 3, 3, 6, 11, 4, 7. #pts: the number of training data; #ats: the number of


Anthony K H Tung and Xin Xu and Beng Chin Ooi. CURLER: Finding and Visualizing Nonlinear Correlated Clusters. SIGMOD Conference. 2005.

of three helix clusters with different cluster existence spaces, the iris plant dataset and the image segmentation dataset from the UCI Repository of Machine Learning Databases and Domain Theories [6], and the Iyer time series gene expression data with 10 well-known linear clusters


Igor Fischer and Jan Poland. Amplifying the Block Matrix Structure for Spectral Clustering. Telecommunications Lab. 2005.

are common benchmark sets with real-world data (Murphy & Aha, 1994): the iris, the wine and the breast cancer data sets. Both our methods perform very well on iris and breast cancer. However, the wine data set is too sparse for the context-dependent method: only 178 points in 13 dimensions, giving the conductivity too


Sotiris B. Kotsiantis and Panayiotis E. Pintelas. Logitboost of Simple Bayesian Classifier. Informatica. 2005.

were hand selected so as to come from real-world problems and to vary in characteristics. Thus, we have used data sets from the domains of: pattern recognition (iris, zoo), image recognition (ionosphere, sonar), medical diagnosis (breast-cancer, breast-w, colic, diabetes, heart-c, heart-h, heart-statlog, hepatitis,


Stanley Robson de Medeiros Oliveira. Data Transformation For Privacy-Preserving Data Mining. Doctor of Philosophy thesis, University of Alberta. 2005.

(d_o = 18, d_r = 12). 7.13 Average of F-measure (10 trials) for the Iris dataset (d_o = 5, d_r = 3). 7.14 An example of partitioning for the Pumsb dataset. 7.15 Average of F-measure (10 trials) for the Pumsb dataset over vertically


Jennifer G. Dy and Carla Brodley. Feature Selection for Unsupervised Learning. Journal of Machine Learning Research, 5. 2004.

EM-k-STD (e) Figure 9: Feature selection versus without feature selection on the four-class data. 6.5 Experiments on Real Data We examine the FSSEM variants on the iris, wine, and ionosphere data sets from the UCI learning repository (Blake and Merz, 1998), and on a high resolution computed tomography (HRCT) lung image data set which we collected from the IUPUI medical center (Dy et


Jeroen Eggermont and Joost N. Kok and Walter A. Kosters. Genetic Programming for data classification: partitioning the search space. SAC. 2004.

is disappointing as only our clustering gp algorithm with 3 clusters per numerical valued attribute manages to really outperform our simple gp but still performs much worse than C4.5. The Iris Data Set If we look at the results of our gp algorithms on the Iris data set in Table 8 we see that by far the best performance is achieved by our clustering gp algorithm with 3 clusters per numerical valued


Remco R. Bouckaert and Eibe Frank. Evaluating the Replicability of Significance Tests for Comparing Learning Algorithms. PAKDD. 2004.

perform differently in 19 out of 27 cases. For some rows, the test consistently indicates no difference between any two of the three schemes, in particular for the iris and Hungarian heart disease datasets. However, most rows contain at least one cell where the outcomes of the test are not consistent. The row labeled "consistent" at the bottom of the table lists the number of datasets for which all


Mikhail Bilenko and Sugato Basu and Raymond J. Mooney. Integrating constraints and metric learning in semi-supervised clustering. ICML. 2004.

Experiments were conducted on three datasets from the UCI repository: Iris, Wine, and Ionosphere (Blake & Merz, 1998); the Protein dataset used by Xing et al. (2003) and Bar-Hillel et al. (2003), and randomly sampled subsets from the Digits


Qingping Tao. Making Efficient Learning Algorithms with Exponentially Many Features. PhD Dissertation, Faculty of The Graduate College, University of Nebraska. 2004.

(T_0 = n^2 and T_s = 10n^2). M - Metropolis, G - Gibbs, MG - Metropolized Gibbs, PT - Parallel Tempering, BF - Brute Force. Data Sets: iris, car, breast cancer, voting, auto, annealing; n: 4, 6, 9, 16, 25, 38; M: 5.3 ± 2.1, 1.7 ± 0.8, 31.5 ± 5.0, 5.0 ± 2.1, 12.8 ± 7.5, 1.0 ± 0.7; G: 6.7 ± 3.8, 1.9 ± 0.8, 30.9 ± 5.5, 5.0 ± 2.4, 15.6 ± 7.8, 0.6 ± 0.5; MG: 6.0 ± 1.7


Yuan Jiang and Zhi-Hua Zhou. Editing Training Data for kNN Classifiers with Neural Network Ensemble. ISNN (1). 2004.

i.e. glass, hayes-roth and wine. It is surprising that Depuration obtains the best performance on only one data set, i.e. iris, as RelabelOnly does. These observations indicate that NNEE is a better editing approach than Depuration. Moreover, since the effect of Depuration is only comparable to that of


Sugato Basu. Semi-Supervised Clustering with Limited Background Knowledge. AAAI. 2004.

like stop-word removal, tf-idf weighting, and removal of very high-frequency and very low-frequency words (Dhillon & Modha, 2001). From the UCI collection we selected Iris, which is a well-known dataset having 150 points in 4 dimensions. We used the active pairwise constrained version of KMeans on Iris, and SPKMeans on Classic3-subset. Learning curves with cross validation: For all algorithms on


Judith E. Devaney and Steven G. Satterfield and John G. Hagedorn and John T. Kelso and Adele P. Peskin and William George and Terence J. Griffin and Howard K. Hung and Ronald D. Kriz. Science at the Speed of Thought. Ambient Intelligence for Scientific Discovery. 2004.

EXAMPLES Figure 1 shows part of our visualization of the Iris data set [2]. (The full visualization contains multiple rooms with an alternate visualization of the same data set in each room, enabling a scientist to visit each of the rooms.) On the near side of the left


Eibe Frank and Mark Hall. Visualizing Class Probability Estimators. PKDD. 2003.

<= 1.7 iris virginica (46.0/1.0) > 1.7 Iris-versicolor (48.0/1.0) <= 4.9 petalwidth > 4.9 Iris-virginica (3.0) <= 1.5 Iris-versicolor (3.0/1.0) > 1.5 Fig. 5. The decision tree for the two-class iris dataset. (a) (b) (c) Fig. 6. Visualizing the decision tree for the two-class iris data using (a) petallength and petalwidth, (b) petallength and sepallength, and (c) sepallength and sepalwidth (with the
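As a rough illustration of the kind of tree described in this excerpt (not the tree from the paper), the following Python sketch fits a small scikit-learn decision tree to the two overlapping iris classes, versicolor and virginica, using only petal length and petal width, and prints the learned thresholds; the depth limit and the two-feature restriction are assumptions made for brevity.

    # Minimal sketch: a small decision tree on the two overlapping iris classes
    # (versicolor vs. virginica), using only petal length and petal width.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    iris = load_iris()
    mask = iris.target != 0                    # drop setosa -> two-class problem
    X = iris.data[mask][:, [2, 3]]             # petal length, petal width
    y = iris.target[mask]

    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
    print(export_text(tree, feature_names=["petallength", "petalwidth"]))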


Ross J. Micheals and Patrick Grother and P. Jonathon Phillips. The NIST HumanID Evaluation Framework. AVBPA. 2003.

Jonathon's signature therefore contains five sigmembers: one for the iris scan, three for each facial image, and one for the gait video. For the first sigmember, the iris scan, there is a single dataset with a single file that contains the iris data. Three sigmembers, for the facial imagery, each have a single dataset, each with a single file that each contain a facial image. The fifth sigmember,


Sugato Basu. Also Appears as Technical Report, UT-AI. PhD Proposal. 2003.

like stop-word removal, tf-idf weighting, and removal of very high-frequency and very low-frequency words (Dhillon & Modha, 2001). From the UCI collection we selected Iris, which is a well-known dataset having 150 points in 4 dimensions. We used the active pairwise constrained version of KMeans on Iris, and SPKMeans on Classic3-subset. Learning curves with cross validation: For all algorithms on


Dick de Ridder and Olga Kouropteva and Oleg Okun and Matti Pietikäinen and Robert P W Duin. Supervised Locally Linear Embedding. ICANN. 2003.

retained in the remaining M dimensions [3]. This local intrinsic dimensionality estimate is denoted by ML . The feature extraction process is illustrated in Figure 1: the C = 3 classes in the iris data set [1] are mapped onto single points by 1-SLLE. #-SLLE retains some of the class structure, but reduces within-class dispersion compared to LLE. Clearly, SLLE is suitable as a feature extraction step


Aristidis Likas and Nikos A. Vlassis and Jakob J. Verbeek. The global k-means clustering algorithm. Pattern Recognition, 36. 2003.

it is also possible to employ the above presented k-d tree approach with the global k-means algorithm. 4 Experimental results We have tested the proposed clustering algorithms on several well-known data sets, namely the iris data set [8], the synthetic data set [9] and the image segmentation data set [8]. In all data sets we conducted experiments for the clustering problems obtained by considering only
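For readers unfamiliar with the incremental scheme behind the global k-means algorithm mentioned above, the sketch below shows the basic idea under stated simplifications: clusters are added one at a time, and every data point is tried as the initial position of the new center. The k-d tree acceleration mentioned in the excerpt is omitted, and the use of scikit-learn's KMeans is an implementation convenience, not the authors' code.

    # Simplified sketch of the incremental "global k-means" idea (no k-d tree speedup).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import load_iris

    X = load_iris().data

    def global_kmeans(X, K):
        centers = X.mean(axis=0, keepdims=True)          # the 1-means solution
        for k in range(2, K + 1):
            best_inertia, best_centers = np.inf, None
            for x in X:                                   # try every point as the new center
                init = np.vstack([centers, x])
                km = KMeans(n_clusters=k, init=init, n_init=1).fit(X)
                if km.inertia_ < best_inertia:
                    best_inertia, best_centers = km.inertia_, km.cluster_centers_
            centers = best_centers
        return centers

    print(global_kmeans(X, 3))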


Zhi-Hua Zhou and Yuan Jiang and Shifu Chen. Extracting symbolic rules from trained neural network ensembles. AI Commun, 16. 2003.

80 2 19 13 6 iris plant iris 150 3 4 0 4 statlog australian credit approval credit-a 690 2 15 9 6 statlog german credit credit-g 1,000 2 20 13 7 Table 2 Fidelity of rules extracted via REFNE data set balance voting hepatitis iris credit-a credit-g average fidelity 87.88% 89.26% 84.50% 96.25% 84.13% 74.10% 86.02% Table 3 Comparison of generalization error data set REFNE ensemble single NN C4.5


Jeremy Kubica and Andrew Moore. Probabilistic Noise Identification and Data Cleaning. ICDM. 2003.

We also compared the algorithms by their ability to identify artificial corruptions. Three different test sets were used: a noise free version of the rock data described above, the UCI Iris data set, and the UCI Wine data set [3]. Noise was generated by choosing to corrupt each record with some probability p. For each record chosen, corruption and noise vectors were sampled from their


Julie Greensmith. New Frontiers For An Artificial Immune System. Digital Media Systems Laboratory HP Laboratories Bristol. 2003.

using the g++ compiler version 2.96 for Red Hat Linux 7.3 2.96-113, and was run on one of four Intel Pentium 4 CPU 1.80GHz HP 'e-PC's. On completion of the compilation process, the iris dataset (provided with the source code) was used to perform preliminary testing on the system. Once it was clear how to use the various parameter settings, and that classification could be performed,


Manoranjan Dash and Huan Liu and Peter Scheuermann and Kian-Lee Tan. Fast hierarchical clustering and its validation. Data Knowl. Eng, 44. 2003.

consists of 10,992 objects in 16 dimensions. There are 10 classes corresponding to digits 0...9. The 16 dimensions are drawn by re-sampling from handwritten digits. The Iris dataset has 150 points in 4 dimensions in 3 clusters. Dimensions are sepal length, sepal width, petal length, and petal width. Clusters are Iris Setosa, Iris Versicolour, and Iris Virginica. Each of the 3


Bob Ricks and Dan Ventura. Training a Quantum Neural Network. NIPS. 2003.

an epoch refers to finding and fixing the weight of a single node. We also tried the randomized search algorithm for a few real-world machine learning problems: the lenses, Hayes-Roth and iris datasets [19]. The lenses data set tries to predict whether people will need soft contact lenses, hard contact lenses or no contacts. The iris dataset details features of three different


Jun Wang and Bin Yu and Les Gasser. Concept Tree Based Clustering Visualization with Shaded Similarity Matrices. ICDM. 2002.

we will briefly show how shaded similarity matrices are constructed and how one looks through an example. The data used in the example is part of the Iris data from the UCI repository[9]. The Iris data set contains 150 instances, evenly distributed in 3 classes. We fetch 5 instances from each class, and thus obtain 15 instances (Table 1). The similarity matrix was computed based on Euclidean distance
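As a small illustration of the construction described in this excerpt, the sketch below takes the first five instances of each iris class and builds a 15 x 15 similarity matrix from Euclidean distances; the particular distance-to-similarity mapping (1 / (1 + d)) is an assumption, since the excerpt only states that Euclidean distance was used.

    # Sketch: 15-instance Euclidean similarity matrix (5 instances per iris class).
    import numpy as np
    from scipy.spatial.distance import cdist
    from sklearn.datasets import load_iris

    iris = load_iris()
    subset = np.vstack([iris.data[iris.target == c][:5] for c in range(3)])  # 15 x 4
    dist = cdist(subset, subset)            # pairwise Euclidean distances
    similarity = 1.0 / (1.0 + dist)         # assumed distance-to-similarity mapping
    print(np.round(similarity, 2))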


Michail Vlachos and Carlotta Domeniconi and Dimitrios Gunopulos and George Kollios and Nick Koudas. Non-linear dimensionality reduction techniques for classification and visualization. KDD. 2002.

used in our experiments Dataset #data #dims #classes experiment Iris 100 4 2 leave-1-out c-v Sonar 208 60 2 leave-1-out c-v Glass 214 9 6 leave-1-out c-v Liver 345 6 2 leave-1-out c-v Lung 32 56 3 leave-1-out c-v Image 640 16


Geoffrey Holmes and Bernhard Pfahringer and Richard Kirkby and Eibe Frank and Mark A. Hall. Multiclass Alternating Decision Trees. ECML. 2002.

90.49 89.72 labor 84.67 87.5 + promoters 86.8 87.3 sick-euthyroid 97.71 97.85 + sonar 76.65 74.12 vote 96.5 96.18 +, statistically significant difference Table 3. Wrapping two-class ADTree results dataset 1vs1 1vsRest Random Exhaustive iris 95.13 95.33 95.33 95.33 balance-scale 83.94 85.06 + 85.06 + 85.06 + hypothyroid 99.61 99.63 99.64 99.64 anneal 99.01 98.96 99.05 99.19 + zoo 90.38 93.45 + 95.05 +


Inderjit S. Dhillon and Dharmendra S. Modha and W. Scott Spangler. Class visualization of high-dimensional data with applications. Department of Computer Sciences, University of Texas. 2002.

sketch the outline of the paper. Section 2 introduces class-preserving projections and class-eigenvector plots, and contains several illustrations of the Iris plant and ISOLET speech recognition data sets [27]. Class-similarity graphs and class tours are discussed in Sections 3 and 4. We illustrate the value of the above visualization tools in Section 5, where we present a detailed study of the


Manoranjan Dash and Kiseok Choi and Peter Scheuermann and Huan Liu. Feature Selection for Clustering - A Filter Solution. ICDM. 2002.

are almost correct as well as the selected features are all important and it missed out only one important feature. 5.2 Benchmark and Real Datasets Iris dataset, popularly used for testing clustering and classification algorithms, is taken from UCI ML repository [5]. It contains 3 classes of 50 instances each, where each class refers to a type


Ayhan Demiriz and Kristin P. Bennett and Mark J. Embrechts. A Genetic Algorithm Approach for Semi-Supervised Clustering. E-Business Department, Verizon Inc.. 2002.

506 points), House Votes (16 variables, 435 points), Breast Cancer Diagnostic (30 variables, 569 points), Pima Diabetes ( 8 variables, 769 points), and Iris ( 4 variables, 150 points). The datasets have categorical dependent variables except Housing. The continuous dependent variable for this dataset was categorized at the level of 21.5. Iris is a three class problem. The other datasets are


Wai Lam and Kin Keung and Charles X. Ling. PR 1527. Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong. 2001.

from different real-world applications in various domains, such as the city-cycle fuel consumption (Am), Wisconsin breast cancer (Bc) and the famous iris plant database (Ir). Table 1 shows the data sets and their corresponding code used in this paper. For each data set, we randomly partitioned the data into ten even portions. Ten trials derived from 10-fold cross-validation were conducted


Jinyan Li and Guozhu Dong and Kotagiri Ramamohanarao and Limsoon Wong. DeEPs: A New Instance-based Discovery and Classification System. Proceedings of the Fourth European Conference on Principles and Practice of Knowledge Discovery in Databases. 2001.

we highlight some interesting points. 1. DeEPs versus k-NN. • Both DeEPs and k-NN perform equally accurately on soybean-small (100%) and on iris (96%). • DeEPs wins on 26 data sets; k-NN wins on 11. It can be seen that the accuracy of DeEPs is generally better than that of k-NN. • The speed of DeEPs is about 1.5 times slower than that of k-NN. The main reason is that DeEPs


David Hershberger and Hillol Kargupta. Distributed Multivariate Regression Using Wavelet-Based Collective Data Mining. J. Parallel Distrib. Comput, 61. 2001.

Application of this method to Linear Discriminant Analysis, which is related to parametric multivariate regression, produced classification results on the Iris data set that are comparable to those obtained with centralized data analysis. Key Words: data mining, distributed data mining, collective data mining, knowledge discovery, wavelets, regression 1.


David Horn and A. Gottlieb. The Method of Quantum Clustering. NIPS. 2001.

minima appear, as seen in Fig. 3. Nonetheless, they lie high and contain only a few data points. The major minima are the same as in Fig. 2. 3.2 iris Data Our second example consists of the iris data set [10], which is a standard benchmark obtainable from the UCI repository [11]. Here we use the first two principal components to define the two dimensions in which we apply our method. Fig. 4, which
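The preprocessing step mentioned here (working in the plane of the first two principal components of the iris data) can be reproduced in a few lines; whether the features were standardized before the projection is an assumption of this sketch.

    # Sketch: project iris onto its first two principal components.
    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    X = load_iris().data
    X2 = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))
    print(X2.shape)   # (150, 2)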


Asa Ben-Hur and David Horn and Hava T. Siegelmann and Vladimir Vapnik. A Support Vector Method for Clustering. NIPS. 2000.

the core regions by an SV method with a global optimal solution. We have found examples where a local maximum is hard to identify by Roberts' method. 3.2 The iris data We ran SVC on the iris data set [9], which is a standard benchmark in the pattern recognition literature. It can be obtained from the UCI repository [10]. The data set contains 150 instances, each containing four measurements of


Neil Davey and Rod Adams and Mary J. George. The Architecture and Performance of a Stochastic Competitive Evolutionary Neural Tree Network. Appl. Intell, 12. 2000.

5 and 6 are illustrated in Figures 2 and 3. The IRIS data set is included to provide a benchmark performance. Set 1 2-D single source Gaussian cluster, zero mean and unit variance. Simple cluster, base line test. Set 2 20-D single source Gaussian cluster, zero


Edgar Acuna and Alex Rojas. Ensembles of classifiers based on Kernel density estimators. Department of Mathematics University of Puerto Rico. 2000.

has been developed to carry out all our tasks. The results are shown in the table 7. Table 6. Comparison of Bagging using classical and adaptive kernel classifiers Classical Kernel Adaptive Kernel Dataset Single Bagged Improv Single Bagged Improv Iris 4.00 3.33 16.75 4.67 4.00 14.34 Glass 44.97 40.52 9.90 35.20 33.25 5.54 Heart-C 22.09 20.05 9.23 23.60 19.80 16.10 Breast-W 4.34 4.10 5.53 4.88 4.53


Manoranjan Dash and Huan Liu. Feature Selection for Clustering. PAKDD. 2000.

in Figure 3. The X-axis of the plots is the number of most important features and the Y-axis is the tr(P_W^{-1} P_B) value for the corresponding subset of most important features. For the Iris data set the trace value was the maximum for the two most important features. For the D3C, D4C and D6C data the trace value increases with the addition of important features at a fast rate but slows down to almost a halt
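The tr(P_W^{-1} P_B) criterion used here scores a feature subset by the ratio of between-cluster to within-cluster scatter over those features. The sketch below computes it in the usual way from scatter matrices; using the iris class labels in place of cluster assignments is an assumption made for brevity.

    # Sketch (assumed details): score a feature subset by trace(S_W^{-1} S_B).
    import numpy as np
    from sklearn.datasets import load_iris

    iris = load_iris()
    X, labels = iris.data, iris.target        # class labels stand in for cluster labels

    def trace_criterion(X, labels, features):
        Xs = X[:, features]
        mean_all = Xs.mean(axis=0)
        d = len(features)
        S_W, S_B = np.zeros((d, d)), np.zeros((d, d))
        for c in np.unique(labels):
            Xc = Xs[labels == c]
            diff = Xc - Xc.mean(axis=0)
            S_W += diff.T @ diff                                  # within-cluster scatter
            m = (Xc.mean(axis=0) - mean_all).reshape(-1, 1)
            S_B += len(Xc) * (m @ m.T)                            # between-cluster scatter
        return np.trace(np.linalg.solve(S_W, S_B))

    # on iris the petal features typically score much higher than the sepal features
    print(trace_criterion(X, labels, [2, 3]), trace_criterion(X, labels, [0, 1]))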


Carlotta Domeniconi and Jing Peng and Dimitrios Gunopulos. An Adaptive Metric Machine for Pattern Classification. NIPS. 2000.

used were taken from the UCI Machine Learning Database Repository [10], except for the unreleased image data set. They are: 1. Iris data. This data set consists of q = 4 measurements made on each of N = 100 iris plants of J = 2 species; 2. Sonar data. This data set consists of q = 60 frequency measurements


David M J Tax and Robert P W Duin. Support vector domain description. Pattern Recognition Letters, 20. 1999.

almost Gaussian distributed and class 2 is scattered around it. The SVDD cannot distinguish one class 2 object from class 1. Finally, the outlier methods are applied to the iris dataset. Here, all methods work reasonably well, which indicates that the data distributions of the classes are well clustered. Only the Parzen density estimation slightly overtrains. From these results we


Ismail Taha and Joydeep Ghosh. Symbolic Interpretation of Artificial Neural Networks. IEEE Trans. Knowl. Data Eng, 11. 1999.

and universal approach. A rule evaluation technique that orders extracted rules based on three performance measures is then proposed. The three techniques are applied to the iris and breast cancer data sets. The extracted rules are evaluated qualitatively and quantitatively, and compared with those obtained by other approaches. Index Terms: rule extraction, hybrid systems, knowledge refinement, neural


Stephen D. Bay. Combining Nearest Neighbor Classifiers Through Multiple Feature Subsets. ICML. 1998.

comparison, we used the Wilcoxon signed rank test and found that MFS1 and MFS2 were significantly better than all others with a confidence level greater than 99%. MFS only performed poorly on two datasets: Iris and Tic-Tac-Toe. For Iris, both MFS1 and MFS2 gave the lowest accuracy out of all the classifiers. This can possibly be explained by the small number of features in the Iris dataset. With


Wojciech Kwedlo and Marek Kretowski. Discovery of Decision Rules from Databases: An Evolutionary Approach. PKDD. 1998.

Features Examples Classes: australian 15 (9 nominal) 690 2; diabetes 8 768 2; german 20 (13 nominal) 1000 2; glass 9 214 7; hepatitis 19 (13 nominal) 155 2; iris 4 150 3. Table 1. Description of the datasets used in the experiments. Dataset Majority C4.5 EDRL: australian 55.5 85.3 ± 0.2 86.1 ± 0.4 0.05; diabetes 65.1 74.6 ± 0.3 77.9 ± 0.3 0.2; german 70.0 71.6 ± 0.3 70.1 ±


Foster J. Provost and Tom Fawcett and Ron Kohavi. The Case against Accuracy Estimation for Comparing Induction Algorithms. ICML. 1998.

we often do not know whether the existing distribution is the natural distribution, or whether it has been stratified. The iris data set has exactly 50 instances of each class. The splice junction data set (DNA) has 50% donor sites, 25% acceptor sites and 25% nonboundary sites, even though the natural class distribution is very


Igor Kononenko and Edvard Simec and Marko Robnik-Sikonja. Overcoming the Myopia of Inductive Learning Algorithms with RELIEFF. Appl. Intell, 7. 1997.

We compared the performance of the algorithms also on the following non-medical real world data sets (SOYB, IRIS and VOTE are obtained from the Irvine database[21], SAT is obtained from the StatLog database [18]): SOYB: The famous soybean data set used by Michalski & Chilausky [17]. IRIS: The


Prototype Selection for Composite Nearest Neighbor Classifiers. Department of Computer Science University of Massachusetts. 1997.

Fitness--Feature Selection. 4.10 Relationships between component accuracy and diversity for the Monks-2, Breast Cancer Ljubljana, Diabetes and Iris Plants data sets for the four boosting algorithms. "c" represents the Coarse Reclassification algorithm; "d", Deliberate Misclassification; "f", Composite Fitness; and "s", Composite Fitness--Feature Selection.


Ke Wang and Han Chong Goh. Minimum Splits Based Discretization for Continuous Features. IJCAI (2). 1997.

but never explored multi-way split of a continuous feature, making the simple structure disappear. Consider the following two decision trees built in one of the 10-fold cross-validation runs on the Iris dataset. The first tree is produced by the multi-way split proposed in this paper, and the second by C4.5. Though both trees have the same size and same error rate on test data, the first tree classifies


Ethem Alpaydin. Voting over Multiple Condensed Nearest Neighbors. Artif. Intell. Rev, 11. 1997.

accuracy goes higher but the variance also decreases. This indicates better generalization and is the clear advantage of voting. Complete results are given in Table 4. Results for the IRIS and WINE datasets are similar and are omitted. When one increases the number of voting subsets, after a certain number, new subsets do not contribute much. Whether an additional subset pays off the additional


Tapio Elomaa and Juho Rousu. Finding Optimal Multi-Splits for Numerical Attributes in Decision Tree Learning. ESPRIT Working Group in Neural and Computational Learning. 1996.

used. Data set Examples Attributes Classes Num. Total Iris plant classification 150 4 4 3 Glass type identification 214 9 9 6 Australian credit card assessment 690 6 14 2 Wisconsin breast cancer data 699 9 9 2


Ron Kohavi. Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid. KDD. 1996.

easy to understand when the log probabilities were presented as evidence that adds up in favor of different classes. Figure 1 shows a visualization of the Naive-Bayes classifier for Fisher's iris data set, where the task is to determine the type of iris based on four attributes. Each bar represents evidence for a given class and attribute value. Users can immediately see that all values for
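The additive-evidence view behind such a visualization can be sketched directly: under naive Bayes the log-posterior splits into a prior term plus one log-likelihood term per attribute, so each attribute's contribution can be drawn as a bar per class. Gaussian likelihoods for the continuous attributes are an assumption of this sketch; the tool described in the excerpt worked with discretized attribute values.

    # Sketch: per-attribute log-probability "evidence" of a naive Bayes model on iris.
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.naive_bayes import GaussianNB

    iris = load_iris()
    nb = GaussianNB().fit(iris.data, iris.target)

    x = iris.data[100]                          # one virginica example
    for c, name in enumerate(iris.target_names):
        prior = np.log(nb.class_prior_[c])
        # per-attribute Gaussian log-likelihoods log N(x_i; theta_c, var_c)
        evidence = -0.5 * np.log(2 * np.pi * nb.var_[c]) - (x - nb.theta_[c]) ** 2 / (2 * nb.var_[c])
        print(name, round(prior, 2), np.round(evidence, 2), "total:", round(prior + evidence.sum(), 2))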


Daniel C. St. Clair and Ralph W. Wilkerson and Cihan H. Dagli. Rule Set Quality Measures for Inductive Learning Algorithms. Proceedings of the Artificial Neural Networks In Engineering Conference (ANNIE 1996). 1996.

distribution of the 148 instances among the four classes "normal" with 2 instances, "metastases" with 81 instances, "malign" with 61 instances, and "fibrosis" with 4 instances. The Iris data set, developed by R. A. Fisher (1936), lists the measurements of four characteristics of Iris flowers: petal length, petal width, sepal length, and sepal width. The set includes the measurements of 50


Ron Kohavi. A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. IJCAI. 1995.

then an over-represented class in one subset will be under-represented in the other. To demonstrate the issue, we simulated a 2/3, 1/3 split of Fisher's famous iris dataset and used a majority inducer that builds a classifier predicting the prevalent class in the training set. The iris dataset describes iris plants using four continuous features, and the task is to


Ron Kohavi. The Power of Decision Tables. ECML. 1995.

other class is the more prevalent in the training set and the majority inducer predicts the wrong label for the test instance. We have observed a similar phenomenon even with ten-fold CV. The iris dataset has 150 instances, 50 of each class. Predicting any class would yield 33.3% accuracy, but ten-fold CV using a majority induction algorithm yields 21.5% accuracy (averaged over 100 runs of ten-fold
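The counter-intuitive 21.5% figure can be reproduced with a short simulation: with perfectly balanced classes and plain (unstratified) ten-fold cross-validation, the class that happens to be over-represented in a test fold is necessarily under-represented in the corresponding training fold, so a majority-class predictor is systematically wrong. The sketch below is an illustrative simulation, not Kohavi's code.

    # Simulation: balanced 3-class, 150-instance data, majority inducer, plain 10-fold CV.
    import numpy as np

    rng = np.random.default_rng(0)
    y = np.repeat([0, 1, 2], 50)
    accs = []
    for _ in range(100):                        # 100 runs of ten-fold CV
        idx = rng.permutation(len(y))
        for fold in np.array_split(idx, 10):
            train = np.setdiff1d(idx, fold)
            majority = np.bincount(y[train]).argmax()   # most frequent training class
            accs.append(np.mean(y[fold] == majority))
    print(round(float(np.mean(accs)) * 100, 1), "%")    # noticeably below 33.3%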


George H. John and Ron Kohavi and Karl Pfleger. Irrelevant Features and the Subset Selection Problem. ICML. 1994.

performance was on parity5+5 and CorrAL using stepwise backward elimination, which reduced the error to 0% from 50% and 18.8% respectively. Experiments were also run on the Iris, Thyroid, and Monk1* datasets. The results on these datasets were similar to those reported in this paper. We observed high variance in the 25-fold cross-validation estimates of the error. Since our algorithms depend on


Zoubin Ghahramani and Michael I. Jordan. Learning from incomplete data. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES. 1994.

stochastic estimator. 4.4 Classification [Figure 3: classification of the iris data set with missing inputs; percent correct classification (EM vs. MI) as a function of the percentage of missing features.] 100 data points were used for training and 50 for testing. Each data point consisted of 4 real-valued attributes and one of three class labels. The figure shows classification performance ± 1


Gabor Melli. A Lazy Model-Based Approach to On-Line Classification. University of British Columbia. 1989.

7.2 Example of one algorithm (A_1) being more accurate than another (A_2). 7.3 Accuracy performance on the iris dataset for several parameter combinations of the DI_n()-based algorithm. 7.4 Parameter settings for the DI_n() based algorithm that achieve the lowest


Włodzisław Duch and Karol Grudziński. Prototype based rules - a new way to understand the data. Department of Computer Methods, Nicholas Copernicus University.

were extracted recently [2]. For comparison we have analyzed some of these datasets here. The Iris flowers data, taken from the UCI repository [14], has been used in many previous studies. It contains 3 classes (Iris Setosa, Virginica and Versicolor flowers), 4 attributes (sepal and


H. Altay Guvenir. A Classification Learning Algorithm Robust to Irrelevant Features. Bilkent University, Department of Computer Engineering and Information Science.

[Plots of classification accuracy (0.5 to 1.0) versus number of irrelevant features added (0 to 20) for the VFI5, 1NN, 3NN and 5NN algorithms on the Iris and New-thyroid data sets.]


Enes Makalic and Lloyd Allison and David L. Dowe. MML INFERENCE OF SINGLE-LAYER NEURAL NETWORKS. School of Computer Science and Software Engineering Monash University.

0.20, overfitting was observed -- MDL inferred four hidden neurons as optimal rather than three (see Fig. 4). [Figure 5: MML inference of the Iris dataset; message length (nits) versus number of hidden neurons.] Finally, we have tested both MML and MDL-based criteria on a real problem: the Iris dataset from the UCI machine learning repository. This is


Ron Kohavi and Brian Frasca. Useful Feature Subsets and Rough Set Reducts. the Third International Workshop on Rough Sets and Soft Computing.

bears no resemblance to Holte's 1R algorithm. 1993), stopping after a predetermined number of non-improving node expansions. Figure 2 shows the search through the feature subsets in the IRIS dataset. The number in brackets denotes the order in which the nodes are visited. The bootstrap estimate is given with one standard deviation of the accuracy after the +/- sign. The estimated real accuracy (on


G. Rätsch and B. Schölkopf and Alex Smola and Sebastian Mika and T. Onoda and K.-R. Müller. Robust Ensemble Learning for Data Mining. GMD FIRST, Kekuléstr.

generalization performance of AdaBoost in the low noise regime. However, AdaBoost performs worse than other learning machines on noisy tasks [6, 7], such as the iris and the breast cancer benchmark data sets [5]. The present paper addresses the overfitting problem of AdaBoost in two ways. Primarily, it makes an algorithmic contribution to the problem of constructing regularized boosting algorithms.


YongSeog Kim and W. Nick Street and Filippo Menczer. Optimal Ensemble Construction via Meta-Evolutionary Ensembles. Business Information Systems, Utah State University.

with detailed information from most of the input features to learn multiple patterns. Therefore, classifiers with information from few projected variables will not perform well. Note that, among the 15 data sets, there are four multi-class data sets (iris, hypo, segment, and soybean) while the remaining 11 data sets are bi-class data sets. Out of the four multi-class data sets, MEE shows consistently worse


Maria Salamo and Elisabet Golobardes. Analysing Rough Sets weighting methods for Case-Based Reasoning Systems. Enginyeria i Arquitectura La Salle.

are obtained from the UCI repository [MM98]. They are: breast cancer, glass, ionosphere, iris, led, sonar, vehicle and vowel. Private datasets are from our own repository. They deal with diagnosis of breast cancer and synthetic datasets. Datasets related to diagnosis are biopsy and mammogram. Biopsy is the result of digitally processed


Lawrence O. Hall and Nitesh V. Chawla and Kevin W. Bowyer. Combining Decision Trees Learned in Parallel. Department of Computer Science and Engineering, ENB 118 University of South Florida.

0.6 < Petal-Width <= 1.5 and Petal-Length > 4.9 --> Iris-Virginica R5: If 1.5 < Petal-Width <= 1.7 and Petal-Length > 4.9 --> Iris-Versicolor <= 1.7 Figure 1: The C4.5 tree produced on the full Iris dataset and the corresponding rules. adjust just one condition. For example, R1 no longer conflicts; its test is adjusted to be petalwidthcm :5. A more complex problem is a condition in one rule overlaps


Anthony Robins and Marcus Frean. Learning and generalisation in a stable network. Computer Science, The University of Otago.

network. The effectiveness of pseudorehearsal at reducing catastrophic forgetting has been proven using a range of populations, including: randomly constructed autoassociative and heteroassociative data sets [Robins, 1995]; the Iris data set [Robins, 1996]; a classification task using the Mushroom data set [French, 1997]; and an alphanumeric character set using a Hopfield type network [Robins and


Geoffrey Holmes and Leonard E. Trigg. A Diagnostic Tool for Tree Based Supervised Classification Learning Algorithms. Department of Computer Science University of Waikato Hamilton New Zealand.

difference by the range of the tested attribute, giving the formula: cost = |v_1 - v_2| / (max_{a_1} - min_{a_1}). Figure 2 illustrates the problem for case 4 with an example taken from the familiar iris dataset. The minimum cost edit sequence to transform the tree on the left involves deleting the non-root Petal width nodes and their rightmost leaf nodes (giving a cost of 4). We are left with two trees
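Read literally, the cost formula normalizes the gap between two split thresholds on the same attribute by that attribute's observed range; a tiny worked example is sketched below (the petal-width values are illustrative only).

    # Sketch of the edit-cost formula: |v1 - v2| / (max_a - min_a).
    def edit_cost(v1: float, v2: float, attr_max: float, attr_min: float) -> float:
        return abs(v1 - v2) / (attr_max - attr_min)

    # e.g. petal-width thresholds 1.7 vs 1.5, with petal width ranging 0.1 .. 2.5
    print(edit_cost(1.7, 1.5, 2.5, 0.1))   # ~0.083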


Shlomo Dubnov and Ran El-Yaniv and Yoram Gdalyahu and Elad Schneidman and Naftali Tishby and Golan Yona. Clustering By Friends: A New Nonparametric Pairwise Distance Based Clustering Algorithm. Ben Gurion University.

procedure of the cross-validation index (see Section 3) and we only report the resulting cross-validation indices obtained during the computations. In section 5.1 we consider the classical Iris data sets. Then, in section 5.2 we consider the Isolet data set. An application to musical data is considered in section 5.3. 5.1. The Iris Data This data set, due to Fisher (Fisher, 1936), is a classic


Michael R. Berthold and Klaus-Peter Huber. From Radial to Rectangular Basis Functions: A new Approach for Rule Learning from Large Datasets. Institut für Rechnerentwurf und Fehlertoleranz (Prof. D. Schmid), Universität Karlsruhe.

extracted from a Neural Network trained on the data, rather than from the data itself. In this scenario the Neural Network already took care of the noisy patterns. B. The IRIS data. This very famous dataset from Fisher ([5]) contains three classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other two; the latter are not linearly


Norbert Jankowski. Survey of Neural Transfer Functions. Department of Computer Methods, Nicholas Copernicus University.

sphere defined by this metric. The influence of input renormalization (using Minkovsky distance functions) on the shapes of decision borders is illustrated in Fig. 30 for the classical Iris flowers dataset (only the last two input features, x_3 and x_4, are shown; for a description of the data cf. [89]). Dramatic changes in the shapes of decision borders for different Minkovsky metrics are observed.


Karthik Ramakrishnan. UNIVERSITY OF MINNESOTA.

classifier is shown as a straight line across the x-axis for comparison purposes. 15. Bagging, Boosting, and Distance-Weighted test set error rates for the iris data set as the number of classifiers in the ensemble increases. The test set error rate for a single decision tree classifier is shown as a straight line across the x-axis for comparison purposes.


Włodzisław Duch and Rafal Adamczak and Geerd H. F. Diercksen. Neural Networks from Similarity Based Perspective. Department of Computer Methods, Nicholas Copernicus University.

them on a unit sphere defined by this metric. 6 Pedagogical illustration The influence of non-Euclidean distance functions on the decision borders is illustrated here on the classical Iris flowers dataset, containing 50 cases in each of the 3 classes. The flowers are described by 4 measurements (petal and sepal width and length). Two classes, Iris virginica and Iris versicolor, overlap, and therefore


Fernando Fernández and Pedro Isasi. Designing Nearest Neighbour Classifiers by the Evolution of a Population of Prototypes. Universidad Carlos III de Madrid.

first version is due to the high number of centroids to eliminate. An example of the classifier found is given in figure 1(a), showing the centroids located in the mean of the distributions. 3.2 Iris Data Set The Iris Data Set from the UCI Machine Learning Repository 1 [3] is used in the second experiment. This dataset consists of 150 samples of three classes, where each class has 50 examples. The dimension of


Asa Ben-Hur and David Horn and Hava T. Siegelmann and Vladimir Vapnik. A Support Vector Method for Hierarchical Clustering. Faculty of IE and Management Technion.

cost of a decrease in efficiency, which makes our algorithm useful even for very large data-sets. To compare the performance of our algorithm with other hierarchical algorithms we ran it on the Iris data set [15], which is a standard benchmark in the pattern recognition literature. It can be obtained from the UCI repository [16]. The data set contains 150 instances each containing four measurements of


Lawrence O. Hall and Nitesh V. Chawla and Kevin W. Bowyer. Decision Tree Learning on Very Large Data Sets. Department of Computer Science and Engineering, ENB 118 University of South Florida.

0.6 < Petal-Width <= 1.5 and Petal-Length > 4.9 --> Iris-Virginica R5: If 1.5 < Petal-Width <= 1.7 and Petal-Length > 4.9 --> Iris-Versicolor <= 1.7 Figure 1. The C4.5 tree produced on the full Iris dataset and the corresponding rules. The final rules will be ordered by their accuracy taken from the original tree in all cases except for conflict resolution rules for which the accuracy is calculated on


G. Rätsch and B. Schölkopf and Alex Smola and K.-R. Müller and T. Onoda and Sebastian Mika. Arc: Ensemble Learning in the Presence of Outliers. GMD FIRST.

[17] explains the good generalization performance of AdaBoost in the low noise regime. However, AdaBoost performs worse on noisy tasks [10, 11], such as the iris and the breast cancer benchmark data sets [1]. On the latter tasks, a large margin on all training points cannot be achieved without adverse effects on the generalization error. This experimental observation was supported by the study of


Włodzisław Duch and Rudy Setiono and Jacek M. Zurada. Computational intelligence methods for rule-based data understanding.

larger input uncertainties do not change in subsequent minimizations. VIII. EXTRACTION OF RULES -- ILLUSTRATIVE EXAMPLE The process of rule extraction is illustrated here using the well-known Iris dataset, provided by Fisher in 1936. The data have been obtained from the UCI machine learning repository [118]. The Iris data have 150 vectors evenly


H. Altay Güvenir and Aynur Akkus. WEIGHTED K NEAREST NEIGHBOR CLASSIFICATION ON FEATURE PROJECTIONS. Department of Computer Engineering and Information Science, Bilkent University.

row of each k value presents the accuracy of the WkNNFP algorithm with equal feature weights, while the second row shows the accuracy obtained by WkNNFP using Table 1: Comparison on some real-world datasets. Data Set: cleveland glass horse hungarian iris liver sonar wine No. of Instances 303 214 368 294 150 345 208 178 No. of Features 13 9 22 13 4 6 60 13 No. of Classes 2 6 2 2 3 2 2 3 No. of Missing


Huan Liu. A Family of Efficient Rule Generators. Department of Information Systems and Computer Science National University of Singapore.

and to compare with the results reported in [22] since they have done some comparison with other methods such as ID3 [14] and the one by Han et al [7]. Then, we show the results for another two data sets: Golf-Playing [13] and Iris [4]. The authors of [13, 22] did not provide testing data. Only the Iris data is divided evenly into two sets (75 patterns each) for training and testing. Datasets CAR


Rudy Setiono and Huan Liu. Fragmentation Problem and Automated Feature Construction. School of Computing National University of Singapore.

[21] which has 9 binary features x_1, x_2, ..., x_9. The 512 instances are labeled as follows: (a) Class 1: x_1 x_2 x_3 + x_1 x_2 + x_7 x_8 x_9 + x_7 x_9, (b) Class 2: otherwise. • Iris dataset [6] which has 150 instances described by 4 continuous attributes: sepal length (A_1), sepal width (A_2), petal length (A_3), and petal width (A_4). Each pattern belongs to one of the 3 possible


François Poulet. Cooperation between automatic algorithms, interactive algorithms and visualization tools for Visual Data Mining. ESIEA Recherche.

by the user on the screen and the right part shows the transformed line (the best separating plane computed with the convex hulls). Fig. 6. An example of the automatic best separating plane on iris data set 2.3 Clustering The interactive algorithm described in the previous section can also be used for unsupervised classification. The computation of the convex hulls and the nearest points can be


Takao Mohri and Hidehiko Tanaka. An Optimal Weighting Criterion of Case Indexing for Both Numeric and Symbolic Attributes. Information Engineering Course, Faculty of Engineering The University of Tokyo.

(vote, soybean, crx, hypo) were in the distribution floppy disk of Quinlan's C4.5 book (Quinlan 1993). The remaining four data sets (iris, hepatitis, led, led-noise) were obtained from the Irvine Machine Learning Database (Murphy & Aha 1994). Including our 3 methods, VDM, PCF, CCF, IB4, and C4.5 are compared. Quinlan's C4.5 is a


Huan Li and Wenbin Chen and I-Fan Shen. Supervised Local Tangent Space Alignment for Classification.

containing multiple classes. The results obtained with the unsupervised and supervised LTSA are expected to be different as is shown in Fig.1. The iris data set [Blake and Merz, 1998] includes 150 4-D data belonging to 3 different classes. Here first 100 data points are selected as training samples and mapped from the 4-D input space to a 2-D feature space


Adam H. Cannon and Lenore J. Cowen and Carey E. Priebe. Approximate Distance Classification. Department of Mathematical Sciences The Johns Hopkins University.

data before implementing the ADC classification algorithm. Here, only the raw data has been analyzed using the same procedure described above. 5 Conclusions Results on the Wisconsin breast cancer data set and the Fisher iris data set compare very well with previous work on these data. The Pima Indian diabetes results are also nearly competitive with previous work. In all three cases it should be


Aïda Valls and Vicenç Torra. Explaining the consensus of opinions with the vocabulary of the experts. Dept. d'Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili.

as L i+1 end if return d(P i ,P c ) calculated with the definition (1). end. 4.1 Experimental results We have made different tests on different domains. Particularly, we have considered a well-known data set: Iris [10], which has 150 flowers described by means of 4 numerical attributes: petal and sepal length, and petal and sepal width; and a second set of data built by 5 colleagues who have described


Włodzisław Duch and Rafal Adamczak and Krzysztof Grabczewski. Extraction of crisp logical rules using constrained backpropagation networks. Department of Computer Methods, Nicholas Copernicus University.

a few cases. The final solution may be presented as a set of rules or as a network of nodes performing logical functions. III. Three examples A. Iris data In the first example the classical Iris dataset was used (all datasets were taken from the UCI machine learning repository [9]). The data has 150 vectors evenly distributed in three classes, called iris-setosa, iris-versicolor and iris-virginica.


Eric P. Kasten and Philip K. McKinley. MESO: Perceptual Memory to Support Online Learning in Adaptive Software. Proceedings of the Third International Conference on Development and Learning (ICDL.

sizes and feature counts. Data Set Size Features Classes Iris 150 4 3 ATT Faces 360 10,304 40 Mult. Feature 2,000 649 10 Mushroom 8,124 22 2 Japanese Vowel 9,859 12 9 Letter 20,000 16 26 Cover Type 581,012 54 7 set. As such, no


Karol Grudziński and Włodzisław Duch. SBL-PM: A Simple Algorithm for Selection of Reference Instances in Similarity Based Methods. Department of Computer Methods, Nicholas Copernicus University.

the UCI repository [9] and contains 3 classes (Iris Setosa, Virginica and Versicolor flowers), 4 attributes (measurements of leaf and petal widths and length), 50 cases per class. The entire Iris dataset has been shown here (Fig. 1) in two dimensions, x_3 and x_4, which are much more informative than the other two (cf. [10]). In Fig. 2 the reference set obtained by taking the value of # from the


Chih-Wei Hsu and Cheng-Ru Lin. A Comparison of Methods for Multi-class Support Vector Machines. Department of Computer Science and Information Engineering National Taiwan University.

section we present experimental results on several problems from the Statlog collection [20] and the UCI Repository of machine learning databases [1]. From the UCI Repository we choose the following datasets: iris, wine, glass, and vowel. Those problems had already been tested in [27]. From the Statlog collection we choose all multi-class datasets: vehicle, segment, dna, satimage, letter, and shuttle. Note


Alexander K. Seewald. Dissertation Towards Understanding Stacking Studies of a General Ensemble Learning Scheme ausgefuhrt zum Zwecke der Erlangung des akademischen Grades eines Doktors der technischen Naturwissenschaften.

ionosphere Compressed glyph visualization for dataset iris Compressed glyph visualization for dataset labor Compressed glyph visualization for dataset lymph Compressed glyph visualization for dataset primary-tumor Compressed glyph visualization for


Włodzisław Duch and Rafal Adamczak and Krzysztof Grabczewski and Grzegorz Zal. A hybrid method for extraction of logical rules from data. Department of Computer Methods, Nicholas Copernicus University.

for benchmark applications were taken from the UCI machine learning repository [14]. Application of the constructive MLP2LN approach to the classical Iris dataset was already presented in detail [15], therefore only new aspects related to the hybrid method are discussed here. The Iris data has 150 vectors evenly distributed in three classes: iris-setosa,


Włodzisław Duch and Rafal Adamczak and Geerd H. F. Diercksen. Classification, Association and Pattern Completion using Neural Similarity Based Methods. Department of Computer Methods, Nicholas Copernicus University.

them on a unit sphere defined by this metric. 6 PEDAGOGICAL ILLUSTRATION The influence of non-Euclidean distance functions on the decision borders is illustrated here on the classical Iris flowers dataset, containing 50 cases in each of the 3 classes. The flowers are described by 4 measurements (petal and sepal width and length). Two classes, Iris virginica and Iris versicolor, overlap, and therefore


Stefan Aeberhard and Danny Coomans and De Vel. THE PERFORMANCE OF STATISTICAL PATTERN RECOGNITION METHODS IN HIGH DIMENSIONAL SETTINGS. James Cook University.

means coincide. FDP performed very well for the exponential data. The results of the real data support the observations made from the simulations. FDP does not perform very well on well-defined data sets (wine data, Iris data), especially when compared to FF. It however compares somewhat better in the other cases, most noticeably in the case of the tertiary institutions data, where it equals the


Michael P. Cummings and Daniel S. Myers and Marci Mangelson. Applying Permutation Tests to Tree-Based Statistical Models: Extending the R Package rpart. Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland.

In this section we show several examples of the application of permutation tests to tree-based statistical models. We begin by permutation testing a classification tree built on the famous Iris dataset. [Tree figure residue: class counts for setosa/versicolor/virginica at each node; the root split is petal length < 2.45 cm.]


Ping Zhong and Masao Fukushima. Second Order Cone Programming Formulations for Robust Multi-class Classification.

problem as follows: max_{α,σ,τ} e^T α - (σ + τ) s.t. Ē^T α = 0, α ≤ (1 - ν)e, σ - τ = ν, ||[ -(1/√(2(K+1))) Ã^T α ; τ ]|| ≤ σ. (38) Table 1: Description of the Iris, Wine and Glass datasets. name, dimension (N), #classes (K), #examples (L): Iris 4 3 150; Wine 13 3 178; Glass 9 6 214. Table 2: Results for Iris, Wine and Glass datasets with noise (ρ = 0.3, κ = 2, ν = 0.05). R a Robust (I)


Włodzisław Duch and Rafal Adamczak and Norbert Jankowski. Initialization of adaptive parameters in density networks. Department of Computer Methods, Nicholas Copernicus University.

network parameters, but it is interesting to note that these results are frequently already of rather high quality. Except for galaxies, all other data was obtained from the UCI repository [13]. The Iris dataset contains 150 cases in 3 classes. After initialization with Gaussian functions including rotations only 4 classification errors are made (97.3% accuracy), which is a better result than many


Aynur Akkuş and H. Altay Güvenir. Weighting Features in k Nearest Neighbor Classification on Feature Projections. Department of Computer Engineering and Information Science, Bilkent University.

significantly. This should be because all the features are equally relevant. On the cleveland, liver, iris and glass (except k = 1) datasets, the weights learned by the individual accuracies always performed significantly better than the others. The weight learning method based on the homogeneity performed better than the other on the


Jun Wang and Bei Yu and Les Gasser. Classification Visualization with Shaded Similarity Matrix. Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign.

similarity matrix is constructed and how it looks, through an example. The data used in the example is part of the Iris data from the UCI repository [25]. There are 150 instances in the original Iris data set, which are evenly distributed in 3 classes: setosa, virginica, and versicolor. For each class, we fetch its first 5 instances from the data file, thus obtaining 15 instances (see Table 1). Table 2


Andrew Watkins and Jon Timmis and Lois C. Boggess. Artificial Immune Recognition System (AIRS): An Immune-Inspired Supervised Learning Algorithm. Computing Laboratory, University of Kent.

where classification accuracy of 98% was achieved using a k-value of 3. This seemed to bode well, and further experiments were undertaken using the Fisher Iris data set, Pima diabetes data, Ionosphere data and the Sonar data set, all obtained from the repository at the University of California at Irvine [4]. Table II shows the performance of AIRS on these data sets


Gaurav Marwah and Lois C. Boggess. Artificial Immune Systems for Classification : Some Issues. Department of Computer Science Mississippi State University.

satisfying some stimulation threshold, but the stimulation threshold for out of class ARBs was somewhat relaxed as compared to in class ARBs. Table 4 shows the accuracy rates obtained for the iris data set using the approaches just described. Five way cross validation was performed to achieve these results. Table 4: Accuracy Rates For Iris Data Set Using Different Approaches For ARB Pool Organization


Igor Kononenko and Edvard Simec. Induction of decision trees using RELIEFF. University of Ljubljana, Faculty of electrical engineering & computer science.

for patients suffering from hepatitis. The data was provided by Gail Gong from Carnegie-Mellon University. We also compared the performance of the algorithms on the following non-medical real world data sets (SOYB, IRIS and VOTE are obtained from the Irvine database (Murphy & Aha, 1991)): SOYB: The famous soybean data set used by Michalski & Chilausky (1980). IRIS: The well known Fisher's problem of


Daichi Mochihashi and Gen-ichiro Kikui and Kenji Kita. Learning Nonstructural Distance Metric by Minimum Cluster Distortions. ATR Spoken Language Translation research laboratories.

[Figure 4: K-means clustering of UCI Machine Learning dataset results; precision versus dimension for (c) the iris dataset and (d) the "soybean" dataset.] The horizontal axis shows


Return to Iris data set page.
