Zoo Data Set

Below are papers that cite this data set, with context shown. Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info.

Return to Zoo data set page.


Yuan Jiang and Zhi-Hua Zhou. Editing Training Data for kNN Classifiers with Neural Network Ensemble. ISNN (1). 2004.

Data set    Categorical  Continuous  Size  Classes
annealing   33           5           798   6
credit      9            6           690   2
glass       0            9           214   7
hayes-roth  4            0           132   3
iris        0            4           150   3
liver       0            6           345   2
pima        0            8           768   2
soybean     35           0           683   19
wine        0            13          178   3
zoo         16           0           101   7

On each data set, 10 runs of 10-fold cross-validation are performed with random partitions. The effects of the editing approaches described in Section 2 are compared by coupling them with a 3NN classifier. The ...
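The protocol quoted above is standard and easy to reproduce. A minimal sketch, assuming scikit-learn, a numeric feature matrix X, and labels y (the editing step itself is the paper's contribution and is not shown):

    # 10 runs of 10-fold cross-validation with random partitions,
    # scoring a plain 3NN classifier as in the excerpt above.
    import numpy as np
    from sklearn.model_selection import StratifiedKFold
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.metrics import accuracy_score

    def ten_by_tenfold_cv(X, y, n_runs=10, n_folds=10):
        scores = []
        for run in range(n_runs):
            kf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=run)
            for train_idx, test_idx in kf.split(X, y):
                clf = KNeighborsClassifier(n_neighbors=3)
                clf.fit(X[train_idx], y[train_idx])
                scores.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))
        return np.mean(scores), np.std(scores)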


Mikko Koivisto and Kismat Sood. Exact Bayesian Structure Discovery in Bayesian Networks. Journal of Machine Learning Research, 5. 2004.

contain discrete variables only and no values are missing. The Zoo data set is available from the UCI Machine Learning Repository (Blake and Merz, 1998; the data set was contributed by Richard Forsyth). It contains 17 variables and 101 records. The Alarm data set built by ...


Eibe Frank and Stefan Kramer. Ensembles of nested dichotomies for multi-class problems. ICML. 2004.

Dataset      Instances  Missing (%)  Numeric  Nominal  Classes
prim.-tumor  339        3.9          0        17       22
segment      2310       0.0          19       0        7
soybean      683        9.8          0        35       19
splice       3190       0.0          0        61       3
vehicle      846        0.0          18       0        4
vowel        990        0.0          10       3        11
waveform     5000       0.0          40       0        3
zoo          101        0.0          1        15       7
Table 1. Datasets used for the experiments.

... differences in accuracy by using the corrected resampled t-test at the 5% significance level. This test has been shown to have Type I error at the significance level and ...
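The corrected resampled t-test mentioned in the excerpt is the Nadeau-Bengio correction, which inflates the variance term to account for overlapping training sets. A sketch under the assumption of repeated 10-fold cross-validation (test/train ratio 1/9), where diffs holds the per-fold accuracy differences between the two classifiers:

    import math
    from statistics import mean, variance

    def corrected_resampled_t(diffs, test_frac=1.0 / 9.0):
        # Nadeau & Bengio (2003): replace var/k by var * (1/k + n_test/n_train).
        k = len(diffs)
        d_bar = mean(diffs)
        var_d = variance(diffs)  # sample variance, n-1 denominator
        t = d_bar / math.sqrt((1.0 / k + test_frac) * var_d)
        return t  # compare against a t-distribution with k-1 degrees of freedom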


Eibe Frank and Mark Hall and Bernhard Pfahringer. Locally Weighted Naive Bayes. UAI. 2003.

Dataset   Instances  Missing (%)  Numeric  Nominal  Classes
...       3772       6.0          23       6        2
sonar     208        0.0          60       0        2
soybean   683        9.8          0        35       19
splice    3190       0.0          0        61       3
vehicle   846        0.0          18       0        4
vote      435        5.6          0        16       2
vowel     990        0.0          10       3        11
waveform  5000       0.0          40       0        3
zoo       101        0.0          1        15       7

... 19 datasets for k = 5 and k = 10 respectively. When distance weighting is used with k-nearest neighbours, our method is significantly more accurate on 13 and 17 datasets for k = 5 and k = 10 respectively.
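Distance weighting with k-nearest neighbours, as referenced above, typically weights each neighbour's vote by a decreasing function of its distance. A generic sketch using 1/d weighting (an illustrative choice, not necessarily the exact kernel used in the paper):

    import numpy as np
    from collections import defaultdict

    def weighted_knn_predict(X_train, y_train, x, k=5, eps=1e-9):
        # Vote among the k nearest neighbours, each weighted by 1/distance.
        dists = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(dists)[:k]
        votes = defaultdict(float)
        for i in nearest:
            votes[y_train[i]] += 1.0 / (dists[i] + eps)
        return max(votes, key=votes.get)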


Huan Liu and Hiroshi Motoda and Lei Yu. Feature Selection with Selective Sampling. ICML. 2002.

2 and 3 in Table 2) by simply treating them as continuous. The results are reported in Table 5. ReliefS works as well as or better than ReliefF except for 3 cases (some particular bucket sizes for data sets PrimaryTumor, Zoo, Colic). The detailed re...
(Figure: precision against percentage by bucket size from 7 to 1, comparing ReliefS and ReliefF.)
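Both ReliefF and the paper's ReliefS build on the basic Relief weight update: sample an instance, find its nearest hit and nearest miss, then reward features that separate the miss and penalize features that separate the hit. A simplified two-class sketch (ReliefF's k-neighbour, multi-class generalization and ReliefS's selective sampling are omitted):

    import numpy as np

    def relief(X, y, n_samples=100, seed=0):
        # X is assumed numeric and range-normalized; y holds two classes.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(n_samples):
            i = rng.integers(n)
            dists = np.abs(X - X[i]).sum(axis=1)
            dists[i] = np.inf                   # exclude the instance itself
            same = (y == y[i])
            same[i] = False
            hit = np.argmin(np.where(same, dists, np.inf))
            miss = np.argmin(np.where(~same, dists, np.inf))
            w += (np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])) / n_samples
        return w                                # larger weight = more relevant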


Michael Bain. Structured Features from Concept Lattices for Unsupervised Learning and Classification. Australian Joint Conference on Artificial Intelligence. 2002.

pre-conditions of any operator and (c) it constructs a (conjunctive) definition as a single definite clause. In a first experiment we applied Conduce to a task of unsupervised learning. The zoo data set contains 101 instances. Each is described using 17 attributes and a unique name, such as aardvark, ostrich, seasnake, wasp, etc. It is an artificial data set and is not supposed to be taxonomically


Mukund Deshpande and George Karypis. Using conjunction of attribute values for classification. CIKM. 2002.

Dataset  Continuous  Categorical  Classes  Instances
...      ...         ...          7        214
heart    13          0            2        270
hepati   6           13           2        155
horse    7           15           2        368
iris     4           0            3        150
labor    8           8            2        57
led7     0           7            10       3200
lymph    0           18           4        148
pima     8           0            2        768
tic-tac  0           9            2        958
wine     13          0            3        178
zoo      0           16           7        101
Table 1: UCI dataset statistics.

We performed our experiments using a 10-way cross-validation scheme and computed average accuracy across the different runs. We ran our experiments using a support threshold of 1.0% for all ...
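The paper classifies using conjunctions of attribute values mined above a support threshold. A toy sketch of the support-counting step, limited to pairs for brevity (the paper's actual mining procedure is richer; rows is assumed to be a list of attribute-to-value dicts):

    from itertools import combinations

    def frequent_pairs(rows, min_support=0.01):
        # Count every conjunction of two attribute-value pairs and keep
        # those whose relative support meets the threshold (1.0% above).
        n = len(rows)
        counts = {}
        for row in rows:
            for pair in combinations(sorted(row.items()), 2):
                counts[pair] = counts.get(pair, 0) + 1
        return {p: c / n for p, c in counts.items() if c / n >= min_support}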


Neil Davey and Rod Adams and Mary J. George. The Architecture and Performance of a Stochastic Competitive Evolutionary Neural Tree Network. Appl. Intell, 12. 2000.

No.  Attribute  Value
...  ...        0
7    Aquatic    1
8    Predator   1
9    Toothed    0
10   Backbone   1
11   Breathes   1
12   Venomous   0
13   Fins       0
14   Legs       4
15   Tail       1
16   Domestic   0
17   Catsize    1
Table 3. The input vector for the platypus instance in the Zoo data set.

4.2.2 Representational Results. 4.2.2.1 Picture Data Results. Figure 8 shows a typical result formed by using SCENT on the 153-vector data set. Each final and non-final node is represented in the ...
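The reconstructed Table 3 translates directly into an input vector. A small sketch using the zoo attribute names (attributes 1-6 are truncated in the excerpt and omitted here):

    # The platypus instance from Table 3, encoded as attribute -> value.
    platypus = {
        "aquatic": 1, "predator": 1, "toothed": 0, "backbone": 1,
        "breathes": 1, "venomous": 0, "fins": 0, "legs": 4,
        "tail": 1, "domestic": 0, "catsize": 1,
    }
    # Numeric vector in attribute order, as it would be presented to the network.
    vector = list(platypus.values())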


Manoranjan Dash and Huan Liu. Hybrid Search of Feature Subsets. PRICAI. 1998.

(Figure: crossing points of LVF and ABB. Panel (a) plots the ratio of #selected features to M for the datasets LungCancer, Promoters, Soybean, Lymphography, Mushroom, Splice, Zoo, Vote, LED17, and Par3+3; panel (b), "Added over all datasets", plots the sum of these ratios over all datafiles for TotalRuns=1000 and TotalRuns=2000.)


Guszti Bartfai. Victoria University of Wellington (Te Whare Wananga o te Upoko o te Ika a Maui), Department of Computer Science, PO Box 600. 1996.

... capabilities of HART networks. In order to demonstrate that the HART networks are capable of developing class hierarchies, we trained and tested the networks on the Zoo machine learning benchmark data set (Merz and Murphy, 1996). This data set contains 101 instances of animals described with 18 attributes such as "hair", "aquatic", "domestic" and so on. It is relatively small, but is adequate ...


D. Randall Wilson and Tony R. Martinez. Heterogeneous Radial Basis Function Networks. Proceedings of the International Conference on Neural Networks (ICNN). 1996.

higher generalization accuracy than RBF in 12 out of 23 cases, 10 of which were significant at the 95% level or above. RBF had a higher accuracy in only four cases, and only one of those (the Zoo data set) had a difference that was statistically significant. It is interesting to note that in the Zoo data set, 15 out of 16 of the attributes are boolean, and the remaining attribute, while not linear,
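The point about boolean attributes reflects the paper's use of a heterogeneous distance inside the RBF units. A simplified sketch in that spirit: overlap (0/1 mismatch) for boolean or nominal attributes and range-normalized difference for linear ones (Wilson and Martinez define their actual metrics more carefully in their work on heterogeneous distance functions):

    def heterogeneous_distance(x, y, is_nominal, ranges):
        # x, y: attribute vectors; is_nominal: per-attribute flags;
        # ranges: per-attribute value ranges for the linear attributes.
        total = 0.0
        for xi, yi, nominal, r in zip(x, y, is_nominal, ranges):
            if nominal:
                total += 0.0 if xi == yi else 1.0   # overlap metric
            else:
                total += (abs(xi - yi) / r) ** 2    # normalized difference
        return total ** 0.5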


Christophe G. Giraud-Carrier and Tony Martinez. ILA: Combining Inductive Learning with Prior Knowledge and Reasoning. University of Bristol, Department of Computer Science. 1995.

none of the continuous attributes are discretized.

Table 1 - Selected Applications and Attributes
Dataset        #instances  #inputs  Input space  #output values
zoo            90          16       Nominal      7
iris           150         4        Linear       3
lenses         24          4        Lin/Nom      3
hepatitis      155         19       Linear       2
voting-84      435         16       Boolean      2
glass          214         9        Linear       7
breast-cancer  699         ...


Christophe G. Giraud-Carrier and Tony Martinez. An Incremental Learning Model for Commonsense Reasoning. Department of Computer Science, Brigham Young University.

cases are reciprocal). APPENDIX B: Following are the precepts used in the simulations of Section 4. We give them in terms of features and predicted output. All other attributes are don't-care.
zoo dataset:
1. If animal has four legs, then animal is a mammal.
2. If animal has feathers, then animal is a bird.
3. If animal lays eggs, is aquatic, and has fins, then animal is a fish.
lenses dataset: 1. If ...
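The three zoo precepts quoted above translate directly into a rule list. A minimal sketch, assuming an animal record is a dict of zoo attributes (the helper name is illustrative):

    def classify(animal):
        # Precepts from Appendix B, tried in order; all other attributes
        # are don't-care, so if no precept fires there is no prediction.
        if animal.get("legs") == 4:
            return "mammal"
        if animal.get("feathers") == 1:
            return "bird"
        if animal.get("eggs") == 1 and animal.get("aquatic") == 1 \
                and animal.get("fins") == 1:
            return "fish"
        return None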


Jun Wang and Bei Yu and Les Gasser. Classification Visualization with Shaded Similarity Matrix. Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign.

because the fact that object A is the nearest neighbor of object B does not imply that B is the nearest neighbor of A. Figure a) in Fig. 8 (on the last page) illustrates this setting. The Zoo data set used in the figure comes from the UCI repository. It contains 101 instances with 7 classes {mammal, bird, reptile, fish, amphibian, insect, and invertebrate}. In the distance threshold setting, only ...
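The asymmetry noted in the excerpt is easy to demonstrate with three points on a line: A's nearest neighbor is B, yet B's nearest neighbor is C. A tiny self-contained example:

    points = {"A": 0.0, "B": 1.0, "C": 1.5}

    def nearest(name):
        # Nearest neighbor by absolute distance, excluding the point itself.
        return min((p for p in points if p != name),
                   key=lambda p: abs(points[p] - points[name]))

    assert nearest("A") == "B"
    assert nearest("B") == "C"   # not A: the relation is not symmetric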


Mehmet Dalkilic and Arijit Sengupta. A Logic-Theoretic Classifier Called Circle. School of Informatics, Center for Genomics and BioInformatics, Indiana University.

like contact-lenses and weather, as well as large data sets like monks, mushroom, and Zoo. As an example of the performance improvement, while full Circle took over an hour to terminate on the Zoo data set, the Randomized Circle with 8 attributes per ...


Alexander K. Seewald. Towards Understanding Stacking: Studies of a General Ensemble Learning Scheme. Dissertation, carried out to obtain the academic degree of Doctor of Technical Sciences.

(Figure 6.4: Learning curves for datasets primary-tumor to zoo. X-axis: training set (8-9 = CV, 7 = 75%, 6 = 62%, ..., 1 = 25%); y-axis: hold-out accuracy.)
Chapter 7, Towards a Theoretical Framework: In this chapter, we show that the ensemble learning scheme Stacking is universal in the sense that most ensemble learning schemes ...


